Exploration and Exploitation in Parkinson’s Disease: Behavioral Analyses

Authors
Affiliations

Björn Meder

Health and Medical University, Potsdam, Germany

Martha Sterf

Medical School Berlin, Berlin, Germany

Charley M. Wu

University of Tübingen, Tübingen, Germany

Matthias Guggenmos

Health and Medical University, Potsdam, Germany

Published

August 2, 2025

Code
# Housekeeping: Load packages and helper functions
# Housekeeping
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(message = FALSE)
knitr::opts_chunk$set(warning = FALSE)
knitr::opts_chunk$set(fig.align='center')
knitr::opts_chunk$set(prefer_html = TRUE)

options(knitr.kable.NA = '')
options(brms.backend = "cmdstanr") # more stability for M1 Mac

packages <- c('gridExtra', 'BayesFactor', 'tidyverse', "RColorBrewer", "lme4", "sjPlot", "lsr", "brms", "kableExtra", "afex", "emmeans", "viridis", "ggpubr", "hms", "scales", "cowplot", "gtsummary", "webshot", "webshot2", "parameters", "bridgesampling", "cmdstanr",  "magick", "grid")
lapply(packages, require, character.only = TRUE)

set.seed(0815)

# file with various statistical functions, among other things it provides tests for Bayes Factors (BFs)
source('statisticalTests.R')

# Wrapper for brm models: saves the fitted model the first time it is run, otherwise loads it from disk
run_model <- function(expr, modelName, path='brm', reuse = TRUE) {
  path <- paste0(path, '/', modelName, ".brm")
  fit <- NULL
  if (reuse) {
    fit <- suppressWarnings(try(readRDS(path), silent = TRUE))
  }
  if (is.null(fit) || is(fit, "try-error")) { # not on disk (or reuse = FALSE): fit and save
    fit <- eval(expr)
    saveRDS(fit, file = path)
  }
  fit
}

# Setting some plotting params
w_box          <- 0.2      # width of boxplot, also used for jittering points and lines    
line_jitter    <- w_box / 2
xAnnotate      <- -0.3

# jitter params
jit_height  <- 0.01
jit_width   <- 0.05
jit_alpha   <- 0.6

# colors 
groupcolors    <- c("#7570b3", "#1b9e77", "#d95f02")
choice3_colors <- c("#e7298a", "#66a61e", "#e6ab02")
Code
########################################################
# get behavioral data
########################################################
dat_gridsearch <- read_csv("data/data_gridsearch_Parkinson.csv", show_col_types = FALSE) %>% 
  mutate(type_choice  = factor(type_choice, levels = c("Repeat", "Near", "Far"))) 

# normalize reward and previous reward
dat_gridsearch$z = dat_gridsearch$z / 50
dat_gridsearch$previous_reward = dat_gridsearch$previous_reward / 50

########################################################
# get bonus round data
########################################################
dat_bonus   <- read_csv(file="data/data_gridsearch_Parkinson_bonusround.csv") %>% 
  mutate(bonus_environment = as.factor(bonus_environment))

########################################################
# get subject data
########################################################
dat_sample <- read_delim("data/data_gridsearch_subjects.csv", escape_double = FALSE, trim_ws = TRUE, show_col_types = FALSE) %>% 
  mutate(gender = as.factor(gender),
        # group  = factor(group, levels = c("Control", "PD on", "PD off"))) %>% 
         #group  = factor(group, levels = c("Control", "PD on", "PD off"), labels = c("Control", "PD on", "PD off"))) %>% 
        group = fct_recode(group,
                       "Control" = "PNP",
                       "PD on"  = "PD+",
                       "PD off" = "PD-"),
    group = fct_relevel(group, "Control", "PD on", "PD off")
  ) %>% 
  mutate(last_ldopa = if_else(group != "Control", as_hms(last_ldopa), as_hms(NA)),
         next_ldopa = if_else(group != "Control", as_hms(next_ldopa), as_hms(NA)),
         time_exp = if_else(group != "Control", as_hms(time_exp), as_hms(NA))) %>% 
  mutate(time_since_ldopa = as.numeric(time_exp - last_ldopa, unit = "mins"))

# combine behavioral and subject data
dat <- dat_sample %>% 
  left_join(dat_gridsearch, by = "id") %>% 
  arrange(group)

1 Abstract

We investigated how patients with Parkinson’s disease (PD) balance the explore-exploit trade-off using a spatially correlated bandit task, where the spatial structure of rewards facilitated value generalization (i.e., nearby options yield similar rewards). Participants were tested either shortly after taking their regular Levodopa (L-Dopa) dose (N=33) or just before their next scheduled dose (N=32). Patients with polyneuropathy served as a control group (N=34), comparable in age, depressive symptoms, and basic cognitive functioning. Behavioral and computational analyses revealed distinct patterns of exploration and exploitation. PD patients on L-Dopa balanced exploration and exploitation, though not as efficiently as polyneuropathy patients. In stark contrast, patients off L-Dopa rarely exploited known high-value options and primarily explored novel ones. This overreliance on exploration impaired their ability to navigate the explore-exploit trade-off and maximize rewards. To better understand the mechanisms underlying these behavioral differences, we employed a computational approach using the Gaussian Process Upper Confidence Bound (GP-UCB) model. This model integrates similarity-based generalization with two distinct exploration mechanisms: directed exploration, which seeks to reduce uncertainty about rewards, and random exploration, which introduces stochastic variability in choice behavior. The model parameters showed that behavioral differences between the on- and off-medication conditions were primarily driven by differences in uncertainty-directed exploration, while the level of random exploration remained unchanged. Both PD groups showed reduced generalization compared to controls, contributing to poorer overall performance. Our findings indicate that L-Dopa selectively modulates uncertainty-directed exploration, providing a more nuanced understanding of the central role of dopamine in the regulation of exploratory behavior.

2 Intro

A central distinction between different forms of exploration behavior is that directed exploration reflects the drive for knowledge about novel options, whereas undirected exploration refers to random variability in the choice process (Giron et al., 2023; Meder et al., 2021; Sadeghiyeh et al., 2020; Schulz et al., 2019; Wu et al., 2018).

[TO BE CONTINUED]

3 Experiment

We investigated how patients with Parkinson’s disease (PD) manage the explore-exploit trade-off using a spatially correlated multi-armed bandit task. Participants accumulated rewards by selecting tiles (options) with normally distributed rewards. The spatial correlation between rewards facilitated generalization, allowing participants to adapt to the structure of the environment and balance exploring new options versus exploiting known high-reward options.

Screenshot from experiment, procedure and example environments


3.1 Materials and procedure

Forty distinct environments were generated using a radial basis function (RBF) kernel with \(\lambda = 1\), creating a bivariate reward function that maps each tile location on the grid to a specific reward value. These reward functions varied gradually across the grid, creating environments with spatially correlated rewards: the correlation between neighboring options is r ≈ 0.6 and decreases exponentially with distance.
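The neighbor correlation can be checked directly from the kernel. A minimal sketch, assuming the common RBF parameterization \(k(d) = \exp(-d^2 / (2\lambda^2))\) (the exact parameterization is not stated above):

```r
# Correlation between two tiles at distance d under an RBF kernel
# (assumes the common parameterization; lambda is the length-scale)
rbf_cor <- function(d, lambda = 1) exp(-d^2 / (2 * lambda^2))

rbf_cor(1)  # neighboring tiles: ~0.61
rbf_cor(2)  # two steps apart:   ~0.14
```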

Participants completed 10 rounds of the task, each featuring a new environment drawn without replacement from the set of 40 environments. In each round, participants had 25 choices to accumulate rewards. The first round served as a tutorial to familiarize participants with the task and was excluded from the analyses. The final round (round 10) was a bonus round where, after 15 choices, participants were asked to predict rewards for five unrevealed options. Data from this round were also excluded from the main analysis and analyzed separately.

At the start of each round, one tile was randomly revealed, and participants then made 25 sequential choices. On each trial, they could either click a new tile or re-click a previously selected tile. Selections were made by clicking a tile on the computer screen with the mouse, upon which participants received a reward arbitrarily scaled to the range [0, 50]. Re-clicked tiles showed small variations in reward due to normally distributed noise.

3.2 Sample

We collected data from adult participants with Parkinson’s disease (PD) who regularly receive Levodopa (L-Dopa) for symptomatic treatment (Abbott, 2010; Tambasco et al., 2018). Participants were recruited via a neurologist’s outpatient practice. Eligible participants were evaluated based on Hoehn-Yahr scores recorded in their patient files. The scale assesses disease severity and motor impairments based on a score from 1 to 5, with higher scores indicating greater severity (Goetz et al., 2004; Hoehn & Yahr, 1967). We limited recruitment to individuals with scores between 1 and 3, as scores of 4 and 5 reflect severe impairment.

PD patients were randomly assigned to two conditions: on medication (PD on) and off medication (PD off). In the PD on group (N=33), patients’ scheduled L-Dopa dose was administered at least 30 minutes before the start of the experiment. In the PD off group (N=32), the experiment was scheduled shortly before participants’ next L-Dopa dose, so that they were in a low-dopamine state during testing, offering a clear contrast to the PD on group. Thus, we refer to the ‘on medication’ condition as the state after taking L-Dopa and the ‘off medication’ condition as the state before the next scheduled dose.

The comparison group (N=34) consisted of individuals of similar age diagnosed with polyneuropathy (Control), a condition affecting the peripheral nervous system that can lead to physical symptoms such as pain, sensory loss, or motor weakness. Unlike Parkinson’s disease, however, polyneuropathy does not involve central dopaminergic dysfunction or cognitive impairment.

3.3 Clinical assessment

To characterize participants’ clinical status, we employed standardized measures assessing Parkinson’s disease severity, basic cognitive function, and depressive symptoms. PD severity was evaluated using the Hoehn-Yahr scale, which rates motor impairments such as postural instability and gait difficulties (Hoehn & Yahr, 1967). Participants receive a score between 1 and 5, with higher scores indicating more severe impairment. Basic cognitive function of all participants was assessed with the Mini-Mental State Examination (MMSE), which is frequently used in patients with dementia (Folstein et al., 1975). The test comprises 30 questions pertaining to different domains, including memory (e.g., recalling three objects), temporal and spatial orientation (e.g., date and location), and arithmetic ability. Finally, all participants answered the German version of the Beck Depression Inventory II, a self-report inventory consisting of 21 items measuring depressive symptoms (Beck et al., 1996; Hautzinger et al., 2006).

3.4 Sample characteristics

Note

exclude participant with pump

Table 1 shows the demographics of our sample, along with their Hoehn-Yahr, MMSE, and BDI scores. In the PD on group, the mean time since the last L-Dopa dose was 104 min; in the PD off group it was 253 min.

Table 1
Code
# dat_sample %>% 
#   group_by(group) %>% 
#   summarise(n = n(),
#             female = sum(gender == "f"),
#             mean_age = mean(age),
#             sd_age = sd(age),
#             mean_BDI = mean(BDI, na.rm= T), 
#             mean_MMSE = mean(MMSE, na.rm= T),
#             mean_HY = mean(hoehn_yahr, na.rm= T),
#             mean_time_since_ldopa = mean(time_since_ldopa,  na.rm= T)) %>% 
#   
#   kable(., format = "html", escape = FALSE, digits = 1) %>%
#   kable_styling("striped", full_width = FALSE)

tbl_demographics <- 
dat_sample %>%
  mutate(
    gender = factor(gender, levels = c("m","f"),
                    labels = c("Male","Female"))
  ) %>%
  select(group, gender, age, BDI, MMSE, hoehn_yahr, time_since_ldopa) %>%
  tbl_summary(
    by = group,
    type = list(
      gender                                          ~ "categorical",
      c(age, BDI, MMSE, hoehn_yahr, time_since_ldopa)  ~ "continuous"
    ),
    label = list(
      age        ~ "Age",
      hoehn_yahr ~ "PD severity (Hoehn-Yahr)",
      BDI        ~ "Depression (BDI)",
      time_since_ldopa        ~ "Time since medication (min)"
    ),
    statistic = list(
      all_categorical() ~ "{n} ({p}%)",
      all_continuous()  ~ "{mean} ({sd})"
    ),
    digits  = all_continuous() ~ 1,
    missing = "no"
  ) %>%
  add_p(
    # global ANOVA for continuous, chi-square for categorical
    test = list(
      all_continuous()   ~ "oneway.test",
      all_categorical()  ~ "chisq.test"
    )
  ) %>%
  modify_header(
    label   ~ "**Variable**",
    p.value ~ "**p-value**"
  ) %>%
  bold_labels()  %>%
  modify_fmt_fun( # replace  "NA (NA)" or "NA" with a dash
    all_stat_cols() ~ function(x) {
      x[x %in% c("NA (NA)", "NA")] <- " "
      x
    }
  ) %>% 
  modify_header(label ~ "") 

# GENERATE LATEX CODE
# as_gt(tbl_demographics) %>% gt::as_latex()
# as_kable_extra(tbl_demographics, format = "latex")
# as_kable(tbl, format = "latex")

4 Behavioral data

All behavioral data are stored in data_gridsearch_Parkinson.csv, which contains the following variables (Table 2):

  • id: participant id
  • age: participant age in years
  • gender: (m)ale, (f)emale, (d)iverse
  • x and y: the sampled coordinates on the grid
  • chosen: the x and y coordinates of the chosen tile
  • z: the reward obtained from the chosen tile, normalized to the range 0–1. Re-clicked tiles could show small variations in the observed color (i.e., underlying reward) due to normally distributed noise, \(\epsilon \sim N(0,1)\).
  • z_scaled: the observed outcome (reward), scaled in each round to a randomly drawn maximum value in the range of 70% to 90% of the highest reward value
  • trial: the trial number (0–25), with 0 corresponding to the initially revealed random tile; i.e., trial 1 is the first choice
  • round: the round number (1 through 10), with 1 = practice round (not analyzed) and 10 = bonus round (analyzed only for the bonus round judgments)
  • distance: the Manhattan distance between consecutive clicks. NA for trial 0, the initially revealed random tile.
  • type_choice: categorizes consecutive clicks as “Repeat” (clicking the same tile as on the previous trial), “Near” (clicking a directly neighboring tile, i.e., distance = 1), or “Far” (clicking a tile with distance > 1). NA for trial 0.
  • previous_reward: the reward z obtained on the previous trial. NA for trial 0.
  • last_ldopa: time of the last L-Dopa dose (HH:MM)
  • next_ldopa: scheduled time of the next L-Dopa dose (HH:MM)
  • time_exp: time of the experiment (HH:MM)
  • time_since_ldopa: time since the last L-Dopa dose (in minutes)
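To illustrate how distance and type_choice are derived from consecutive click coordinates, here is a small base-R sketch with made-up coordinates (illustrative, not the actual preprocessing code):

```r
# Hypothetical sequence of clicks (the first entry is the revealed tile, trial 0)
x <- c(3, 3, 3, 4, 6)
y <- c(3, 3, 4, 5, 7)

# Manhattan distance between consecutive clicks; NA for trial 0
distance <- c(NA, abs(diff(x)) + abs(diff(y)))

# Categorize consecutive clicks by distance
type_choice <- ifelse(distance == 0, "Repeat",
               ifelse(distance == 1, "Near", "Far"))

distance     # NA 0 1 2 4
type_choice  # NA "Repeat" "Near" "Far" "Far"
```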

We analyzed the behavioral data in terms of performance and exploration behavior. These analyses exclude the tutorial and bonus rounds, leaving a total of 200 search decisions (8 rounds \(\times\) 25 trials) per participant. We then report the results of the bonus round, where we analyze participants’ reward predictions and confidence judgments. We report both frequentist statistics and Bayes factors (\(BF\)) to quantify the relative evidence of the data in favor of the alternative hypothesis (\(H_A\)) over the null hypothesis (\(H_0\)); see Appendix for details and references. Various helper functions are implemented in statisticalTests.R. Regression analyses were performed in a Bayesian framework with Stan, accessed via the R package brms, and complemented by frequentist hierarchical regression analyses (via the lmer function from package lme4).
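The pairwise group comparisons reported below follow a common pattern. A minimal base-R sketch with simulated data (the actual helpers live in statisticalTests.R, which is not shown here; Bayes factors are computed separately, e.g., via BayesFactor::ttestBF):

```r
set.seed(1)
# Simulated per-subject mean rewards for two hypothetical groups
g1 <- rnorm(33, mean = 0.65, sd = 0.10)  # e.g., Control
g2 <- rnorm(32, mean = 0.50, sd = 0.10)  # e.g., PD off

# Two-sample t-test with equal variances, as in the reported comparisons
tt <- t.test(g1, g2, var.equal = TRUE)

# Cohen's d from the pooled standard deviation
sp <- sqrt(((length(g1) - 1) * var(g1) + (length(g2) - 1) * var(g2)) /
             (length(g1) + length(g2) - 2))
d  <- (mean(g1) - mean(g2)) / sp

c(t = unname(tt$statistic), p = tt$p.value, d = d)
```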

Code
# show example data
head(dat %>%
       group_by(group) %>%
       slice_head(n=25)  %>% 
       ungroup()) %>%
  kable("html", caption = "Example behavioral data.") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>% 
  scroll_box(width = "100%", height = "300px")
Table 2
Example behavioral data.
id age gender group BDI MMSE hoehn_yahr last_ldopa next_ldopa time_exp time_since_ldopa session x y chosen z z_scaled time trial round distance type_choice previous_reward
111 54 f Control 12 30 1 3 3 28 0.38 18 1.719326e+12 0 1
111 54 f Control 12 30 1 3 3 28 0.38 18 1.719326e+12 1 1 0 Repeat 0.38
111 54 f Control 12 30 1 3 4 36 0.52 22 1.719326e+12 2 1 1 Near 0.38
111 54 f Control 12 30 1 4 5 45 0.64 26 1.719326e+12 3 1 2 Far 0.52
111 54 f Control 12 30 1 5 6 54 1.00 38 1.719326e+12 4 1 2 Far 0.64
111 54 f Control 12 30 1 6 7 63 0.26 14 1.719326e+12 5 1 2 Far 1.00


4.1 Performance: Rewards by round

Code
# mean reward per subject (practice and bonus round excluded)
df_mean_reward_subject_by_round <- dat %>% 
  filter(trial != 0 & round %in% 2:9) %>% # exclude first (randomly revealed) tile and practice round and bonus round
  group_by(id, round) %>% 
  summarise(age = mean(age),
            group = first(group),
            sum_reward = sum(z),
            mean_reward = mean(z), 
            sd_reward = sd(z)) 

df_summary_by_round <- df_mean_reward_subject_by_round %>%
  group_by(round, group) %>%
  summarize(
    mean_of_means = mean(mean_reward, na.rm = TRUE),  # Renaming to avoid confusion
    se_reward = sd(mean_reward, na.rm = TRUE) / sqrt(n()),  # Standard error
    .groups = 'drop'
  )

aov_rounds <- aov_ez(
  id = "id",                 
  dv = "mean_reward",        
  within = "round",          
  between = "group",         
  data = df_mean_reward_subject_by_round
)


# kable(as.data.frame(aov_rounds$anova_table), 
#       format = "html", escape = FALSE, digits = 2, 
#       caption = "ANOVA results with round as within-subjects factor and group as between subjects factor, where rewards per round were first aggregated within subjects.") %>%
#   kable_styling("striped", full_width = FALSE)


brm_rounds <- run_model(
  expr = quote(brm(
    mean_reward ~ round * group + (1 | id),
    data = df_mean_reward_subject_by_round,
    family = gaussian(),
    iter = 4000,
    warmup = 1000,
    chains = 4,
    cores = 4,
    seed = 0511,
    backend = "cmdstanr",
    save_pars = save_pars(all = TRUE)
  )),
  modelName = 'brm_reward_rounds'
)

# Extract fitted values and add to data df
fitted_values <- fitted(brm_rounds, re_formula = NA)
df_mean_reward_subject_by_round$fitted_mean_reward <- fitted_values[, "Estimate"]

p <- ggplot(df_mean_reward_subject_by_round, aes(x = round, y = mean_reward, group = group, shape = group, color = group)) +
  geom_point(data = df_summary_by_round, aes(x = round, y = mean_of_means, shape = group), size = 3) +
  geom_line(aes(y = fitted_mean_reward), linewidth = 1) +  
  geom_jitter(aes(x = round, y = mean_reward), size = 1, alpha = 0.3, width = 0.2) +
  scale_y_continuous("Mean Reward", breaks = c(25,30,35)) +
  xlab("Round") +
  scale_fill_manual(values = groupcolors) +
  scale_color_manual(values = groupcolors) +
  ggtitle("Mean Reward by Rounds and Group (brms)") +
  theme_classic() +
  theme(legend.title = element_blank())

# tbl_regression(brm_rounds, exponentiate = F) 
#tab_model(brm_rounds)

# Reduced model for computing BF: no round term
brm_rounds_reduced <- run_model(brm(
  mean_reward ~ group + (1 | id),
  data = df_mean_reward_subject_by_round,
  family = gaussian(),
  iter = 4000, 
  warmup = 1000, 
  chains = 4, 
  cores = 4, 
  seed = 0511,
  backend = "cmdstanr",
  save_pars = save_pars(all = TRUE)),
modelName='brm_reward_rounds_reduced')

# Compute Bayes Factor: Full vs. Reduced (without round term)
#bf_brm_rounds <- bayes_factor(brm_rounds, brm_rounds_reduced)


# format_parameters(brm_rounds)
# 
# params <- model_parameters(brm_rounds) |> print_md()
# params$Parameter <- gsub("groupPDP", "PD on", params$Parameter)
# params$Parameter <- gsub("groupPDP", "PD on", params$Parameter)
# params$Term <- gsub("round", "Round", params$Term)
# 
# 
# params$Term <- gsub("groupPDP", "PD", params$Term)

# plot_model(brm_rounds, type = "est") +
#   theme_classic()

Figure 1 shows the obtained rewards by round, for each group. An ANOVA with round as within- and group as between-subjects factor showed a difference between groups, with Control patients achieving the greatest rewards, followed by PD on and PD off patients, but no change across rounds (and no interaction). A Bayesian regression analysis yielded comparable results, with the estimated effect of round on mean reward being very small (Estimate = 0, 95% CI [-0.01, 0.01]) and the credible interval including zero. In the subsequent analyses, we therefore aggregate across rounds.

Code
# Plot the mean reward by round for each group with dodged points and error bars
ggplot(df_summary_by_round, aes(x = round, y = mean_of_means, group = group, shape = group,  color = group, fill = group)) +
  geom_line(position = position_dodge(width = 0.3)) +  
  # geom_errorbar(aes(ymin = mean_of_means - 1.96 * se_reward, ymax = mean_of_means + 1.96 * se_reward), width = 0.2, position = dodge, color = "black") +  # 
  geom_errorbar(aes(ymin = mean_of_means - se_reward, ymax = mean_of_means + se_reward), width = 0.2, position = position_dodge(width = 0.3), alpha=0.7) +  #
  geom_point(position = position_dodge(width = 0.3), size = 3, stroke = 1, alpha=.9) +  
  #coord_cartesian(ylim = c(20,40)) +
  coord_cartesian(ylim = c(0.4,0.75)) +
  #scale_shape_manual(values = c(21, 24, 22)) +  # circle, triangle, and square
  scale_fill_manual(values = groupcolors) +  
  scale_color_manual(values = groupcolors) +  
  #scale_color_manual(values = c("black","black","black")) +  
  # scale_y_continuous("Mean reward ± 95% CI") +
  scale_y_continuous("Mean reward ± SE") +
  scale_x_continuous("Round", breaks = 2:9) +
  theme_classic() +
  theme(legend.title = element_blank(),
      plot.title = element_text(size=16),
      axis.text = element_text(size=14),
      axis.title = element_text(size=14))

# ggsave("plots/performance_rounds.png", width = 6, height = 3)
Figure 1: Performance over rounds (excluding tutorial and bonus round).

4.2 Performance: Rewards by group

Code
df_mean_reward_subject <- dat %>% 
  filter(trial != 0 & round %in% 2:9) %>% # exclude first (randomly revealed) tile and practice round and bonus round
  group_by(id) %>% 
  summarise(age = mean(age),
            group = first(group),
            sum_reward = sum(z),
            mean_reward = mean(z), 
            sd_reward = sd(z),
            BDI = first(BDI),          
            MMSE = first(MMSE),  
            hoehn_yahr = first(hoehn_yahr))

# some summary stats for obtained mean rewards
# df_mean_reward_subject %>%
#   group_by(group) %>%
#   summarise(n = n(),
#             m_reward = mean(mean_reward),
#             md_reward = median(mean_reward),
#             var_reward = var(mean_reward),
#             sd_reward = sd(mean_reward),
#             se_reward = sd_reward / sqrt(n),
#             lower_ci_reward = m_reward - qt(1 - (0.05 / 2), n - 1) * se_reward,
#             upper_ci_reward = m_reward + qt(1 - (0.05 / 2), n - 1) * se_reward) %>%
#   
#   kable(., format = "html", escape = FALSE, digits = 2) %>%
#   kable_styling("striped", full_width = FALSE)

Figure 2 shows the overall performance of each group, based on each subject’s mean reward across all trials. Control participants achieved higher rewards than both PD patients on medication (\(t(65)=2.5\), \(p=.014\), \(d=0.6\), \(BF=3.5\)) and off medication (\(t(63)=7.2\), \(p<.001\), \(d=1.8\), \(BF>100\)). Notably, PD patients on medication achieved substantially higher rewards than patients off medication (\(t(62)=5.9\), \(p<.001\), \(d=1.5\), \(BF>100\)), indicating a strong beneficial effect of L-Dopa on the ability to balance exploration and exploitation.

Code
# Boxplots of rewards by group
p_performance <- ggplot(df_mean_reward_subject, aes(x = group, y = mean_reward, color = group, fill = group, shape = group)) +
  geom_hline(yintercept=.5, linetype='dashed', color='black') + # random choice model (mean across all 40 environments)
  geom_boxplot(alpha = 0.2, outlier.shape = NA, width = 0.4) +  
  geom_jitter(width = 0.15, size = 2, alpha = 0.8) +  
  stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 4) +  
  scale_color_manual(values = groupcolors) +
  scale_fill_manual(values = groupcolors) +
  ylab("Mean normalized reward") +
  xlab("") +
  ggtitle("Performance") +
  theme_classic() +
  theme(
    legend.position = 'none',
    legend.title = element_blank(),
      plot.title = element_text(size=24),
      axis.text = element_text(size=18),
      axis.title = element_text(size=18)
  )

p_performance
ggsave("plots/performance.png", p_performance, dpi=300, height = 5, width = 6 )

# Control vs. PD on
# ttestBF(subset(df_mean_reward_subject, group == 'Control')$mean_reward, subset(df_mean_reward_subject, group == 'PD on')$mean_reward, var.equal = TRUE)
# t.test(subset(df_mean_reward_subject, group == 'Control')$mean_reward, subset(df_mean_reward_subject, group == 'PD on' )$mean_reward, var.equal = T)

# Control vs. PD off
# ttestBF(subset(df_mean_reward_subject, group == 'Control')$mean_reward, subset(df_mean_reward_subject, group == 'PD off')$mean_reward, var.equal = TRUE)
# t.test(subset(df_mean_reward_subject, group == 'Control')$mean_reward, subset(df_mean_reward_subject, group == 'PD off' )$mean_reward, var.equal = T)

# PPD vs. PD off
# ttestBF(subset(df_mean_reward_subject, group == 'PD on')$mean_reward, subset(df_mean_reward_subject, group == 'PD off')$mean_reward, var.equal = TRUE)
# t.test(subset(df_mean_reward_subject, group == 'PD on')$mean_reward, subset(df_mean_reward_subject, group == 'PD off' )$mean_reward, var.equal = T)

# Plot for Computational Psychiatry Conference (Tübingen, July 2025)
# p_performance_by_group_CPP <- ggplot(df_mean_reward_subject, aes(x = group, y = mean_reward, color = group, fill = group, shape = group)) +
#   #geom_hline(yintercept=.5, linetype='dashed', color='red') + # random choice model (mean across all 40 environments)
#   geom_boxplot(alpha = 0.2, outlier.shape = NA, width = 0.5) +  
#   geom_jitter(width = 0.15, size = 2, alpha = 0.8) +  
#   stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 2) +  
#   scale_color_manual(values = groupcolors) +
#   scale_fill_manual(values = groupcolors) +
#   ylab("Mean normalized reward") +
#   xlab("") +
#   ggtitle("Performance") +
#   theme_classic() +
#   theme(
#     strip.background = element_blank(),
#     strip.text = element_text(color = "black", size = 20),
#     legend.position = 'none',
#     legend.title = element_blank(),
#       plot.title = element_text(size=24),
#       axis.text = element_text(size=20),
#       axis.title = element_text(size=20)
#   )
# 
# ggsave("plots/performance_by_group_CPP.png", p_performance_by_group_CPP, dpi=300, height = 5, width = 6 )
# ggsave("plots/performance_by_group_CPP.pdf", p_performance_by_group_CPP, height = 5, width = 6 )
Figure 2: Obtained rewards by group. Each dot is one participant’s mean reward across all rounds and trials.

4.3 Performance: Learning curves

Participants’ learning curves (Figure 3) show the average reward obtained in each trial, aggregated across rounds. For both polyneuropathy patients (Control) and PD patients on medication (PD on), mean rewards increased as trials progressed, suggesting they effectively balanced exploration and exploitation to maximize rewards. In stark contrast, PD patients off medication (PD off) showed no improvement across trials.

Figure 3: Learning curves, showing obtained mean reward for each trial, aggregated across rounds.

4.3.1 Performance: Role of physiological and cognitive assessments (BDI, MMSE, Hoehn-Yahr)

We assessed patients in terms of their depressive symptoms (via BDI-II), cognitive functioning (via Mini-Mental State Examination, MMSE), and severity of motor symptoms (via Hoehn-Yahr scale, Parkinson’s disease patients only). We ran a hierarchical regression with reward as the dependent variable and group, BDI score, and MMSE score as predictors, with random intercepts for participants to account for individual differences. This analysis yielded only an effect of group, suggesting that BDI and MMSE scores were not related to performance.

TO DO: Keep an eye on MMSE scores, which are approaching significance

Code
# Hierarchical frequentist regression with random intercept: Reward as function of BDI and MMSE score (all patients)    
lmer_performance_BDI_MMSE <- lmer(z ~ group + BDI + MMSE + (1 | id), 
                                  data = subset(dat, trial > 0 & round %in% 2:9))


# additionally including demographics
# lmer_performance <- lmer(z ~ gender + age + round +group + BDI + MMSE + (1 | id), 
#                                   data = subset(dat, trial > 0 & round %in% 2:9))
# summary(lmer_performance)

#summary(lmer_reward_BDI_MMSE)
tab_model(lmer_performance_BDI_MMSE, title = "Hierarchical regression results: Performance as function of BDI and MMSE score.", bpe="mean")
Hierarchical regression results: Performance as function of BDI and MMSE score.
  z
Predictors Estimates CI p
(Intercept) 0.27 -0.20 – 0.75 0.264
group [PD on] -0.05 -0.08 – -0.01 0.005
group [PD off] -0.12 -0.15 – -0.09 <0.001
BDI 0.00 -0.00 – 0.00 0.869
MMSE 0.01 -0.00 – 0.03 0.100
Random Effects
σ2 0.06
τ00 id 0.00
ICC 0.06
N id 97
Observations 19400
Marginal R2 / Conditional R2 0.037 / 0.094
Code
# Hierarchical Bayesian regression with random intercept: Reward as function of BDI and MMSE score (all patients)                              
brm_performance_BDI_MMSE <- run_model(brm(z ~ group + BDI + MMSE + (1|id),
                                          data=subset(dat, trial > 0 & round %in% 2:9 ),
                                          chains = 4,                               
                                          cores = 4,                                
                                          save_pars = save_pars(all = TRUE),
                                          seed = 0815,
                                          iter = 5000,
                                          warmup=1000,
                                          backend = "cmdstanr",
                                          control = list(adapt_delta = 0.99, max_treedepth = 15)),
                                      #prior = prior(normal(0,10), class = "b")),
                                      modelName = 'brm_performance_BDI_MMSE')
#tab_model(brm_performance_assessment, bpe="mean", title = "Hierarchical Bayesian regression: Performance as function of BDI and MMSE score.") 
#bayes_R2(brm_performance_assessment) 
#tab_model(lmer_performance_BDI_MMSE, brm_performance_BDI_MMSE, title = "Hierarchical regression results: Performance as function of BDI and MMSE score.", bpe="mean")

Next, we ran a hierarchical regression for Parkinson’s patients only, with reward as dependent variable and group, BDI, MMSE, and Hoehn-Yahr score as predictors; with random intercepts for participants to account for individual differences. This analysis only yielded an influence of group, i.e. being on or off L-Dopa.

Code
# Hierarchical frequentist regression with random intercept: Reward as function of BDI, MMSE, and Hoehn-Yahr score (Parkinson's patients only)   
lmer_reward_PD_only_BDI_MMSE_HY <- lmer(z ~ group + BDI + MMSE + hoehn_yahr + (1 | id), 
                                        data = subset(dat, trial > 0 & round %in% 2:9 & group != "Control"))

#summary(lmer_reward_PD_only_BDI_MMSE_HY)

tab_model(lmer_reward_PD_only_BDI_MMSE_HY, title = "Hierarchical regression results: Performance of patients with Parkinson's disease as function of BDI, MMSE, and Hoehn-Yahr score.",  bpe="mean")
Hierarchical regression results: Performance of patients with Parkinson's disease as function of BDI, MMSE, and Hoehn-Yahr score (dependent variable: z).

Predictors       Estimates   CI              p
(Intercept)      0.45        -0.07 – 0.98    0.093
group [PD off]   -0.08       -0.10 – -0.05   <0.001
BDI              0.00        -0.00 – 0.01    0.157
MMSE             0.00        -0.01 – 0.02    0.607
hoehn_yahr       0.01        -0.01 – 0.03    0.456

Random effects: σ² = 0.06; τ00 id = 0.00; ICC = 0.04; N id = 64; Observations = 12800; Marginal R² / Conditional R² = 0.025 / 0.063
Code
# Hierarchical Bayesian regression with random intercept: Reward as function of BDI, MMSE, and Hoehn-Yahr score (Parkinson's patients only)                          
brm_performance_PD_only_BDI_MMSE_HY <- run_model(brm(z ~ group + BDI + MMSE + hoehn_yahr + (1|id),
                                                     data=subset(dat, trial > 0 & round %in% 2:9 & group != "Control"),
                                                     cores=4,
                                                     chains = 4,
                                                     seed = 0815,
                                                     iter = 5000,
                                                     warmup=1000,
                                                     backend = "cmdstanr",
                                                     control = list(adapt_delta = 0.99)),
                                                 #prior = prior(normal(0,10), class = "b")),
                                                 modelName = 'brm_performance_PD_only_BDI_MMSE_HY')

# tab_model(lmer_reward_PD_only_BDI_MMSE_HY, brm_performance_PD_only_BDI_MMSE_HY, title = "Hierarchical regression results: Performance of patients with Parkinson's disease as function of BDI, MMSE, and Hoehn-Yahr score.",  bpe="mean")

4.4 Exploration vs. exploitation choices

To investigate the temporal dynamics of exploration and exploitation, we determined for each trial whether the chosen tile was novel (an exploration decision) or had already been selected previously (an exploitation decision). Intuitively, at the beginning of each round learners should predominantly engage in exploration to identify high-reward options, and gradually shift toward exploitative behavior as they approach the end of the round.

Code
# proportion of unique choices per round per subject
df_unique_choices_round <- 
  dat %>%
  filter(round %in% 2:9 & trial > 0) %>% 
  group_by(id,group, round) %>%  
  summarize(
    total = n(),  #  number of trials
    unique_tiles = n_distinct(x, y),  # unique (x, y) combinations (i.e., tiles)
    repeat_tiles = total - unique_tiles
  ) %>% 
  mutate(prop_unique = unique_tiles/total,
         prop_repeat = repeat_tiles/total) 

# proportion of unique choices across 8 rounds per subject
df_unique_choices_subject <- df_unique_choices_round %>% 
  group_by(id, group) %>% 
  summarize(m_prop_unique = mean(prop_unique),
            m_prop_repeat = mean(prop_repeat))

dat <- dat %>%
  group_by(id, round) %>%
  arrange(trial, .by_group = TRUE) %>%  # Ensure data is sorted by trial
  mutate(
    is_new = factor(if_else(!duplicated(chosen), "new", "repeat")),  # Check uniqueness based on 'chosen' column
    is_new_label = factor(if_else(!duplicated(chosen), "Exploration", "Exploitation"), levels = c("Exploration", "Exploitation"))
  ) %>%
  ungroup()

dat_repeat_prop <- dat %>%
  filter(trial > 0 & round %in% 2:9) %>% 
  group_by(id, group, trial) %>%
  summarize(
    prop_repeat = mean(is_new == "repeat", na.rm = TRUE)  # Calculate proportion of "repeat" (exploitation) choices
    # prop_new = mean(is_new == "new", na.rm = TRUE)  # Calculate proportion of "new" (exploration) choices
  )

We next examined choice behavior, focusing on the trade-off between exploration and exploitation. Figure 4 shows that differences in reward accumulation are driven by learners’ ability to adequately balance exploration (selecting novel options) and exploitation (re-clicking tiles with known high rewards). PD patients tested on medication exploited more than patients off medication (\(t(62)=5.0\), \(p<.001\), \(d=1.3\), \(BF>100\)), whose exploitation levels were notably low.

4.5 Exploitation choices

Controls exploited more than PD patients on medication (\(t(65)=2.6\), \(p=.011\), \(d=0.6\), \(BF=4.2\)), but the difference was less pronounced than that between the two PD groups, consistent with controls’ slightly higher reward accumulation.

Code
p_explore_exploit_choices <-  ggplot(df_unique_choices_subject, aes(x = group, y = m_prop_repeat, color = group, fill = group, shape = group)) +
  geom_boxplot(alpha = 0.2, width = 0.4, outlier.shape = NA) +  
  geom_jitter(width = 0.15, size = 2) +  
  stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 4) +
  scale_color_manual(values = groupcolors) +
  scale_fill_manual(values = groupcolors) +
  scale_y_continuous("Exploitation choices", 
                     breaks = c(0, 0.5, 1), 
                     labels = percent_format(accuracy = 1)) +
  coord_cartesian(ylim = c(0, 1)) +
  xlab("") +
  ggtitle("Exploitation choices") +
  theme_classic() +
  theme(
    strip.background = element_blank(),  
    strip.text = element_text(color = "black", size = 12),
    legend.position = "none",
    axis.title = element_text(size = 18),
    axis.text = element_text(size = 18),
    plot.margin = margin(0, 0, 0, 0),
    plot.title = element_text(size = 24)
  )

p_explore_exploit_choices
ggsave("plots/explore_exploit_choices.png", p_explore_exploit_choices, dpi=300, width = 6, height = 5)
Figure 4: Balancing exploration and exploitation. Shown are the mean proportions of exploitation decisions, aggregated over trials and rounds. Each dot is one participant.

Analysis of the temporal dynamics of exploration and exploitation shows that both controls and PD patients on medication increased the amount of exploitation over time, indicating a goal-directed shift from exploring novel options to exploiting known high-value options (Figure 5). Controls began exploiting earlier in the round and exhibited a stronger overall tendency toward exploitation than patients on medication, suggesting that this earlier focus on exploitation underlies their better performance. In stark contrast, PD patients off medication predominantly engaged in exploration and showed only a weak tendency towards exploitation as the search horizon approached its end.

Code
# Main plot
p_explore_exploit_choices_over_time <- ggplot(dat_repeat_prop, aes(x = trial, y = prop_repeat, fill = group, shape = group, color = group)) +
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2, alpha = 0.5, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "point", size = 3, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "line", position=position_dodge(width=0.5)) +  
  scale_fill_manual(values = groupcolors) + 
  scale_color_manual(values = groupcolors)+ 
  scale_y_continuous(labels = percent_format(accuracy = 1)) + 
  labs(
    x = "Trial",
    y = "Exploitation choices (M±95% CI)",
    title = "Exploration and exploitation over time"
  ) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        # legend.position = c(0.01, 0.4),  
        legend.position = "inside",  
        legend.position.inside = c(0.05, 0.95),  
        legend.justification = c(0, 1),
        legend.title = element_blank(),
        axis.title = element_text(colour="black",size = 18),
        axis.text = element_text(colour="black",size = 18),
        plot.margin = margin(0, 0, 0, 0),
        plot.title = element_text(size = 24),
        legend.text = element_text(colour="black", size=18)
  )


p_explore_exploit_choices_over_time
ggsave("plots/explore_exploit_choices_over_time.png", p_explore_exploit_choices_over_time, dpi=300, width = 8, height = 4.5)
Figure 5: Mean proportion of exploitation decisions per trial, aggregated over rounds.
Code
# Main plot
p_main <- ggplot(dat_repeat_prop, aes(x = trial, y = prop_repeat, fill = group, shape = group, color = group)) +
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2, alpha = 0.5, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "point", size = 3, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "line", position=position_dodge(width=0.5)) +  
  scale_fill_manual(values = groupcolors) + 
  scale_color_manual(values = groupcolors)+ 
  scale_y_continuous(labels = percent_format(accuracy = 1)) + 
  labs(
    x = "Trial",
    y = "Proportion exploitation  (M±95% CI)",
    title = "Exploration and exploitation over time"
  ) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        legend.position = c(0.01, 0.4),  
        legend.justification = c(0, 1),
        legend.title = element_blank(),
        axis.title = element_text(size = 20),
        axis.text = element_text(size = 20),
        plot.margin = margin(0, 0, 0, 0),
        plot.title = element_text(size = 24),
        legend.text = element_text(colour="black", size=18)
  )

# Inset plot
p_inset <-  ggplot(df_unique_choices_subject, aes(x = group, y = m_prop_repeat, color = group, fill = group, shape = group)) +
  geom_boxplot(alpha = 0.2, size = 0.5, outlier.shape = NA) +  
  geom_jitter(width = 0.15, size = 0.8) +  
  stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 2) +
  scale_color_manual(values = groupcolors) +
  scale_fill_manual(values = groupcolors) +
  scale_y_continuous("", 
                     breaks = c(0, 0.5, 1), 
                     labels = percent_format(accuracy = 1)) +
  coord_cartesian(ylim = c(0, 1.25)) +
  xlab("") +
  ggtitle("   Total exploitation choices") +
  theme_classic() +
  theme(
    strip.background = element_blank(),  
    strip.text = element_text(color = "black", size = 12),
    legend.position = "none",
    axis.title.y = element_text(size = 14),
    axis.text.y = element_text(size = 14),
    plot.margin = margin(0, 0, 0, 0),
    plot.title = element_text(size = 14, margin = margin(b = -10))
  )


# Combine plots
p_explore_exploit_choices_inset <- ggdraw() +
  draw_plot(p_main) +
  draw_plot(p_inset, x = 0.14, y = 0.53, width = 0.35, height = 0.35)

# p_explore_exploit_choices_inset

# ggsave("plots/explore_exploit_choices_combined.png", dpi=300, width = 8, height = 4.5)

# ggboxplot(df_unique_choices_subject, 
#                    x = "group", 
#                    # y = "m_prop_unique",
#                    y = "m_prop_repeat",
#                    color = "group", palette = groupcolors, fill = "group", alpha = 0.2, size=0.2,
#                    add = "jitter", jitter.size = 0.0, shape = "group", title = "   Total exploitation choices") +
# scale_y_continuous("", , #"Exploration choices", 
#                    breaks = c(0, 0.5, 1), 
#                    # limits = c(0,1),
#                    labels = percent_format(accuracy = 1)) +
# coord_cartesian(ylim=c(0,1.25)) +
# xlab("") +
# ggtitle("") +
# stat_compare_means(comparisons = list( c("Control", "PD on"), c("PD on", "PD off")  ),
#                    paired = FALSE, 
#                    method = "t.test", 
#                    aes(label = paste0("p = ", after_stat(p.format)))  ) +
# stat_summary(fun = mean, geom="point", shape = 23, fill = "white", size=2) +
# theme_classic() +
# theme(strip.background = element_blank(),  
#       strip.text = element_text(color = "black", size=12),
#       legend.position = "none",
#       axis.title.y = element_text(size=8),
#       axis.text.y = element_text(size=7),
#       plot.margin = margin(0, 0, 0, 0),
#       plot.title = element_text(size = 9, margin = margin(b = -10)), 
# )

# Plot for Computational Psychiatry Conference (Tübingen, July 2025)
# Main plot
# p_main_CPP <- ggplot(dat_repeat_prop, aes(x = trial, y = prop_repeat, fill = group, shape = group, color = group)) +
#   stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2, alpha = 0.5, position=position_dodge(width=0.5)) +  
#   stat_summary(fun = mean, geom = "point", size = 3, position=position_dodge(width=0.5)) +  
#   stat_summary(fun = mean, geom = "line", position=position_dodge(width=0.5)) +  
#   scale_fill_manual(values = groupcolors) + 
#   scale_color_manual(values = groupcolors)+ 
#   scale_y_continuous(labels = percent_format(accuracy = 1)) + 
#   labs(
#     x = "Trial",
#     y = "Exploitation choices (M±95% CI)",
#     title = "Exploration and exploitation over time"
#   ) +
#   theme_classic() +
#   theme(strip.background = element_blank(),  
#         strip.text = element_text(color = "black", size=12),
#         legend.position = c(0.01, 0.4),  
#         legend.justification = c(0, 1),
#         legend.title = element_blank(),
#         axis.title = element_text(size = 20),
#         axis.text = element_text(size = 20),
#         plot.margin = margin(0, 0, 0, 0),
#         plot.title = element_text(size = 24),
#         legend.text = element_text(colour="black", size=18)
#   )
# 
# # Inset plot
# p_inset_CPP <-  ggplot(df_unique_choices_subject, aes(x = group, y = m_prop_repeat, color = group, fill = group, shape = group)) +
#   geom_boxplot(alpha = 0.2, size = 0.5, outlier.shape = NA, width=0.75) +  
#   geom_jitter(width = 0.15, size = 0.8) +  
#   stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 2) +
#   scale_color_manual(values = groupcolors) +
#   scale_fill_manual(values = groupcolors) +
#   scale_y_continuous("", 
#                      breaks = c(0, 0.5, 1), 
#                      labels = percent_format(accuracy = 1)) +
#   coord_cartesian(ylim = c(0, 1.15)) +
#   xlab("") +
#   ggtitle("  Exploitation choices") +
#   theme_classic() +
#   theme(
#     strip.background = element_blank(),  
#     strip.text = element_text(color = "black", size = 12),
#     legend.position = "none",
#     axis.title.y = element_text(size = 14),
#     axis.text.y = element_text(size = 14),
#     plot.margin = margin(0, 0, 0, 0),
#     plot.title = element_text(size = 14, margin = margin(b = -10))
#   )
# 
# 
# # Combine plots
# p_explore_exploit_choices_CPP <- ggdraw() +
#   draw_plot(p_main_CPP) +
#   draw_plot(p_inset_CPP, x = 0.14, y = 0.53, width = 0.35, height = 0.35)
# 
# ggsave("plots/explore_exploit_choices_CPP.png", dpi=300, width = 7, height = 5)
# ggsave("plots/explore_exploit_choices_CPP.pdf", width = 7, height = 5)

The mean proportion of repeat choices (i.e., exploitation) differed among all three groups, with patients with polyneuropathy (Control) showing higher levels of exploitation than both PD on and PD off patients. Parkinson’s patients on medication (PD on) in turn exploited more than patients off medication (PD off).

  • Control vs. PD on: \(t(65)=2.6\), \(p=.011\), \(d=0.6\), \(BF=4.2\)
  • Control vs. PD off: \(t(63)=6.5\), \(p<.001\), \(d=1.6\), \(BF>100\)
  • PD on vs. PD off: \(t(62)=5.0\), \(p<.001\), \(d=1.3\), \(BF>100\)
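As a sketch, the pairwise comparisons above can be reproduced with the packages loaded in the setup chunk (`BayesFactor` for Bayes factors, `lsr` for effect sizes); the exact test variants and BF priors implemented in `statisticalTests.R` may differ, so this is illustrative rather than the original analysis code.

```r
# Sketch: PD on vs. PD off comparison of mean exploitation proportions
# (test variant and BF prior scale are assumptions, not taken from statisticalTests.R)
library(BayesFactor)
library(lsr)

pd_on  <- df_unique_choices_subject$m_prop_repeat[df_unique_choices_subject$group == "PD on"]
pd_off <- df_unique_choices_subject$m_prop_repeat[df_unique_choices_subject$group == "PD off"]

t.test(pd_on, pd_off, var.equal = TRUE)        # t(62) and p-value
cohensD(pd_on, pd_off)                         # Cohen's d
extractBF(ttestBF(x = pd_on, y = pd_off))$bf   # Bayes factor (BF10)
```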

4.5.1 Rewards obtained from exploration and exploitation

We also calculated the mean reward obtained from learners’ explore and exploit choices, respectively. Figure 6 shows that both Controls and PD on patients obtained higher rewards from their exploration choices than PD off patients, indicating better adaptation to the structure of the environment. Similarly, for exploitative choices, PD on and Control patients obtained higher rewards, showing that PD off patients not only exploited less frequently but also did so less efficiently.

Code
# mean reward 
df_reward_explore_exploit <- dat %>%
  filter(round %in% 2:9 & trial > 0) %>% 
  # group_by(id,group, round, is_new) %>%  
  group_by(id,group, is_new_label) %>%  
  summarize(
    n = n(),  #  number of trials
    mean_reward = mean(z)
  )

ggplot(df_reward_explore_exploit, aes(x = group, y = mean_reward, color = group, fill = group, shape = group)) +
  facet_wrap(~is_new_label) + 
  geom_boxplot(alpha = 0.2, size = 0.5, outlier.shape = NA, width=0.75) +  
  geom_jitter(width = 0.15, size = 0.8) +  
  stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 2) +
  scale_color_manual(values = groupcolors) +
  scale_fill_manual(values = groupcolors) +
  scale_y_continuous("Mean reward") + 
                   #breaks = c(0, 10, 20, 30, 40, 50), 
                   #labels = c("0","10","20","30","40","50"),
                   #limits = c(0, 55)) +
  xlab("") +
  ggtitle("Mean rewards for exploration and exploitation") +
  theme_classic() +
  theme(
    strip.background = element_blank(),  
    strip.text = element_text(color = "black", size = 14),
    legend.position = "none",
    axis.title = element_text(size = 14),
    axis.text = element_text(size = 14),
    # plot.margin = margin(0, 0, 0, 0),
    plot.title = element_text(size = 16)
  )

# ggsave("plots/explore_exploit_choices_rewards.png", dpi=300, width = 7, height = 5)
# ggsave("plots/explore_exploit_choices_rewards.pdf", width = 7, height = 5)
Figure 6: Mean rewards for explore versus exploit choices, averaged across all rounds and trials per participant. Each dot represents one participant.

Explore choices

  • Control vs. PD on: \(t(65)=1.0\), \(p=.311\), \(d=0.2\), \(BF=.39\)
  • Control vs. PD off: \(t(63)=3.4\), \(p=.001\), \(d=0.8\), \(BF=24\)
  • PD on vs. PD off: \(t(62)=2.9\), \(p=.005\), \(d=0.7\), \(BF=7.6\)

Exploit choices

  • Control vs. PD on: \(t(60)=0.3\), \(p=.738\), \(d=0.1\), \(BF=.27\)
  • Control vs. PD off: \(t(49)=3.4\), \(p=.001\), \(d=1.0\), \(BF=24\)
  • PD on vs. PD off: \(t(47)=3.2\), \(p=.002\), \(d=0.9\), \(BF=16\)

4.6 Spatial trajectories

4.6.1 Distance between consecutive choices

We next consider participants’ spatial search trajectories (the distance between consecutive clicks). Distance is measured as the Manhattan distance between consecutive clicks, such that repeat clicks have distance 0, clicks on directly neighbouring tiles have distance 1, and clicks farther away have distances > 1.
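As a minimal sketch, the `distance` variable used in these analyses could be derived from the tile coordinates `x` and `y` (the columns used for the unique-tile counts above); the actual preprocessing may differ.

```r
# Manhattan distance between consecutive clicks within each round
# (lag() yields NA for the first click of a round)
dat <- dat %>%
  group_by(id, round) %>%
  arrange(trial, .by_group = TRUE) %>%
  mutate(distance = abs(x - lag(x)) + abs(y - lag(y))) %>%
  ungroup()
```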

The most frequent choice was to select a neighboring tile (distance = 1), reflecting a local search approach (Wu et al., 2018). On average, Control participants had the shortest distances, indicating more local searches and more repeated clicks. Both PD groups showed greater average distances than Controls, with no reliable difference between PD on and PD off patients. The distribution of distances shows that this pattern is primarily driven by repeat choices (distance = 0), which PD off patients made only rarely, reflecting their very limited exploitation behavior.

Figure 7: Distance between consecutive clicks.
Figure 8: Distance between consecutive clicks.

4.6.2 Search distances: All choices

  • Control vs. PD on: \(t(65)=-2.5\), \(p=.015\), \(d=0.6\), \(BF=3.4\)
  • Control vs. PD off: \(t(63)=-2.3\), \(p=.024\), \(d=0.6\), \(BF=2.3\)
  • PD on vs. PD off: \(t(62)=-0.4\), \(p=.661\), \(d=0.1\), \(BF=.28\)
Code
#  non-repeat decisions only: distance by  group
df_mean_distance_subject_exploration <- 
  dat %>%
  filter(trial > 0 & round %in% 2:9) %>%
  filter(distance > 0) %>% # exclude repeat choices (immediate exploitation)
  group_by(id, group) %>%
  summarise(decisions=n(),
            mean_distance = mean(distance, na.rm = TRUE),
            median_distance = median(distance),
            SD_distance = sd(distance),            
            SE_distance = SD_distance / sqrt(decisions)) 

df_mean_distance_subject_exploration %>%
  group_by(group) %>%
  summarise(n = n(),
            # compute SD before redefining mean_distance: summarise() evaluates
            # expressions sequentially, so sd() would otherwise see the new
            # group-level scalar and return NA
            SD_mean_distance = sd(mean_distance, na.rm = TRUE),
            mean_distance = mean(mean_distance, na.rm = TRUE)) %>% 
  kable()
group     n    mean_distance   SD_mean_distance
Control   34   2.404981
PD on     33   2.404382
PD off    31   2.187738
Code
df_mean_distance_subject_exploration %>%
  group_by(group) %>%
  summarise(
    n                = n(),
    # SD and SE must be computed before mean_distance is overwritten,
    # otherwise sd() operates on a single value and returns NA
    SD_mean_distance = sd(mean_distance, na.rm = TRUE),
    SE_mean_distance = SD_mean_distance / sqrt(n),
    mean_distance    = mean(mean_distance, na.rm = TRUE),
    .groups = "drop"
  )
# A tibble: 3 × 5
  group       n mean_distance SD_mean_distance SE_mean_distance
  <fct>   <int>         <dbl>            <dbl>            <dbl>
1 Control    34          2.40               NA               NA
2 PD on      33          2.40               NA               NA
3 PD off     31          2.19               NA               NA

4.6.3 Search distances: Only exploration choices

For this analysis, exploratory choices are defined as consecutive selections in which the previously chosen tile is not re-selected (which would be distance = 0) but a different tile is chosen instead (distance > 0).

  • Control vs. PD on: \(t(65)=0.0\), \(p=.996\), \(d=0.0\), \(BF=.25\)
  • Control vs. PD off: \(t(63)=1.4\), \(p=.171\), \(d=0.3\), \(BF=.57\)
  • PD on vs. PD off: \(t(62)=1.5\), \(p=.132\), \(d=0.4\), \(BF=.68\)

4.6.4 Types of choices

We can also categorize each consecutive click as “repeat” (clicking the same tile as on the previous trial), “near” (clicking a directly neighboring tile, i.e. distance = 1), or “far” (clicking a tile with distance > 1). We first computed for each participant the proportion of each choice type across all 8 rounds x 25 clicks = 200 search decisions and then plotted the mean proportions for each group.
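The `type_choice` variable used in the next chunk can be derived from the Manhattan distance between consecutive clicks; a minimal sketch (the actual preprocessing may differ):

```r
# Categorize each click relative to the previous one:
# repeat (distance 0), near (distance 1), far (distance > 1)
dat <- dat %>%
  mutate(type_choice = factor(
    case_when(
      distance == 0 ~ "repeat",
      distance == 1 ~ "near",
      distance  > 1 ~ "far"
    ),
    levels = c("repeat", "near", "far")
  ))
```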

The analyses reveal distinct search patterns across patient groups. Control participants had the highest proportion of repeat (exploit) decisions, followed by the PD on group; the proportion of repeat decisions in the PD off group was minimal. These behaviors help explain the differences in learning curves, where Control patients showed the greatest improvement, followed by PD on patients. In contrast, PD off patients exhibited no improvement across trials, due to their lower tendency to exploit high-reward options.

Code
df_types_choices_subject <- dat %>% 
  filter(round %in% 2:9 & trial > 0) %>% 
  group_by(id, group, type_choice) %>% 
  summarise(n = n()) %>% 
  complete(type_choice, fill = list(n = 0)) %>% # turn implicit missing values into explicit missing values
  group_by(id, group) %>%
  mutate(prop = n / sum(n)) 

df_types_choices_overall <- df_types_choices_subject %>% 
  group_by(group, type_choice) %>% 
  summarise(n = n(),
            mean_prop = mean(prop),
            SD_prop = sd(prop),
            se_prop = SD_prop / sqrt(n),
            lower_ci_prop = mean_prop - qt(1 - (0.05 / 2), n - 1) * se_prop,
            upper_ci_prop = mean_prop + qt(1 - (0.05 / 2), n - 1) * se_prop)

 p_types_choices_overall <- 
  ggplot(df_types_choices_overall) +
  facet_grid(~group) +
  geom_bar(aes(x = type_choice, y = mean_prop, fill = group, alpha = type_choice), stat = "identity", colour = 'black', width = .9, position=position_dodge2(padding=0.1)) +
  scale_y_continuous("Average proportion", limits = c(0,0.7), breaks = seq(0,1,.2), expand = c(0, 0), labels =   scales::percent_format(accuracy=1)) +
  scale_alpha_discrete(range = c(0.2,1)) +
  scale_x_discrete("Type of search decision") +
  scale_fill_manual(values=groupcolors) +
  scale_color_manual(values=groupcolors) +
  ggtitle("Consecutive search decisions") +
  geom_text(aes(x = type_choice, y = mean_prop, label = scales::percent(mean_prop, accuracy = 1)), vjust = -0.5, size = 6) + 
  theme_classic() +
  theme(#aspect.ratio = 1,
        plot.title = element_text(size=24),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        text = element_text(colour = "black"),
        strip.background = element_blank(),
        strip.text = element_text(size = 18),
        axis.text = element_text(colour="black", size=18),
        axis.title = element_text(colour="black", size=18),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())  

# ggsave("plots/distance_types.png", p_types_choices_overall, dpi=300,height=3, width=5)

An analysis of consecutive choice types over time reveals clear differences in search behavior between the groups. Both Control and PD on patients adapt their strategies as the round progresses by decreasing the number of local (distance = 1) and far (distance > 1) choices, while increasing the number of exploit decisions, indicating a shift from exploration to exploitation. Notably, the data indicate a faster shift to exploitation for Control patients compared to PD on patients, with an earlier and stronger preference for re-selecting known high-reward options. In contrast, PD off patients show limited adaptation, with the proportions of each decision type remaining relatively stable throughout the round, aside from a slight increase in exploit decisions.

Code
df_types_choices_trial_subject <- dat %>%
  filter(round %in% 2:9 & trial > 0) %>%
  group_by(id, group, trial, type_choice) %>%
  summarise(n = n()) %>%
  complete(type_choice, fill = list(n = 0)) %>%  # 
  group_by(id, group, trial) %>%
  mutate(prop = n / sum(n)) %>%  
  ungroup()


ggplot(df_types_choices_trial_subject, aes(x = trial, y = prop, color = type_choice, group = type_choice)) +
  facet_wrap(~group) +  
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2, alpha = 0.5, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "point", size = 2, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "line", position=position_dodge(width=0.5)) +  
  scale_fill_manual(values = choice3_colors) + 
  scale_color_manual(values = choice3_colors)+ 
  labs(
    x = "Trial",
    y = "Mean proportion (±95% CI)",
    title = "Types of choices over time",
    color = "Type choice"
  ) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        legend.position = "inside", 
        legend.position.inside = c(0.15, 1),   
        legend.justification = c(1, 1),
        legend.title = element_blank(),
        legend.spacing.y = unit(0.05, 'cm'), 
        legend.key.height = unit(0.3, 'cm')  
  )    

Types of search decisions over time.
Code
# ggsave("plots/types_choice_by_trial.png", dpi=300, width = 8, height = 4)

4.6.5 Distance as function of previous reward

Finally, we analysed the relation between the reward obtained at time \(t\) and the search distance on the subsequent trial \(t+1\). After obtaining a large reward, searchers should stay local; conversely, after a low reward, they should be more likely to move farther away.

Across all trials and rounds, search distance and previous reward were negatively correlated, indicating that participants tended to search further away following lower rewards compared to higher rewards. This relationship was stronger in Control patients (\(r=-.43\), \(p<.001\), \(BF>100\)) and PD on patients (\(r=-.35\), \(p<.001\), \(BF>100\)) compared to PD off patients (\(r=-.17\), \(p<.001\), \(BF>100\)). These findings suggest that PD patients off medication exhibited less adaptive search behavior than those on medication and individuals with polyneuropathies.

Code
# correlation of previous reward and distance of consecutive choices, by age group and environment
# overall, ignoring within-subject factor
# dat %>% 
#   filter(trial != 0 & round %in% 2:9) %>% # exclude first (randomly revealed) tile and practice round and bonus round
#   group_by(group) %>% 
#   summarise(corTestPretty(previous_reward, distance))

# mean correlation between distance and reward obtained on previous step
# first aggregated within each round and then within each subject
# such that there is one correlation for each subject

# reward_distance_cor <- dat %>% 
#   filter(trial != 0 & round %in% 2:9) %>% # exclude first (randomly revealed) tile and practice round and bonus round
#   group_by(id, round, group) %>% 
#   summarise(cor = cor(previous_reward, distance)) %>% 
#   mutate(cor = replace_na(cor, 0)) %>%  # in some rounds subjects clicked the same tile throughout; set cor=0
#   ungroup() %>% 
#   group_by(id, group) %>% 
#   summarise(mean_cor = mean(cor))

# mean correlation between distance and reward obtained on previous step as function of group
# reward_distance_cor %>% 
#   group_by(group) %>% 
#   summarise(n = n(),
#             m_cor = mean(mean_cor),
#             SD_cor = sd(mean_cor),
#             se_cor = SD_cor / sqrt(n),
#             lower_ci_cor = m_cor - qt(1 - (0.05 / 2), n - 1) * se_cor,
#             upper_ci_cor = m_cor + qt(1 - (0.05 / 2), n - 1) * se_cor)

#plot regression lines based on raw data
# ggplot(subset(dat, trial > 0 & round %in% 2:9), aes(x = previous_reward, y = distance, color = group)) +
#   facet_wrap(~group) +  
#   geom_jitter(alpha = 0.3, width = 0.1, height = 0.1) +  
#   geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +  
#   
#   ggtitle("Regression Lines for Distance by Previous Reward and Group") +
#   theme_minimal() +
#   xlab("Previous Reward") +
#   ylab("Distance")

Given the nested structure of the data, we next employed a Bayesian hierarchical regression analysis to predict search distance from the reward obtained on the previous step, with previous reward, group, and their interaction as population-level (fixed) effects and subject-wise random intercepts. These analyses show that both the magnitude of the reward obtained on the last step and group influence search distance. Notably, PD patients off medication (PD off) adapted their search behavior less in response to reward magnitude, while patients on medication (PD on) exhibited adaptation levels close to the Control group.

Code
# for now, random intercepts only, Random intercept + random slope not stable
# lmer_distance_reward <- lmer(distance ~ previous_reward * group + (previous_reward + group | id), 
# data = subset(dat, trial > 0 & round %in% 2:9))
# fit model
lmer_distance_reward <- lmer(distance ~ previous_reward * group + (1 | id), 
                             data = subset(dat, trial > 0 & round %in% 2:9))

#summary(lmer_distance_reward)
#emmeans(lmer_distance_reward, pairwise ~ previous_reward | group, pbkrtest.limit = 15000)

p_lmer_distance_reward <- plot_model(lmer_distance_reward, type = "pred", terms = c("previous_reward", "group")) +
  stat_summary(dat, mapping=aes(x=previous_reward, y=distance, color=group, fill=group,shape = group), fun=mean, geom='point', alpha=0.7, size=1, na.rm = TRUE)+
  # scale_x_continuous('Previous Reward', breaks = (c(0,10,20,30,40,50))) +
  # scale_x_continuous('Previous Reward', breaks = (c(0,0.210,20,30,40,50))) +
  ylab('Distance to Next Option')+
  scale_fill_manual(values=groupcolors) +
  scale_color_manual(values=groupcolors) +
  ggtitle('Search distance ~ previous reward (lmer)') +
  theme_classic() +
  theme(legend.position = "inside", 
        legend.position.inside = c(0.85, 0.9),   # Use legend.position.inside
        legend.justification = c(1, 1),
        legend.title = element_blank(),
        legend.box.background =  element_blank(),
        legend.key = element_rect(fill = "white"),
        axis.text = element_text(colour = "black", size = 14),
        axis.title = element_text(colour = "black", size = 14)) +
  guides(color = guide_legend(
    override.aes = list(
      fill = NA, 
      size = 2
    )))

# p_lmer_distance_reward$layers[[2]]$show.legend <- FALSE
# p_lmer_distance_reward

# ggsave("plots/regression_distance_reward_lmer.png", p_lmer_distance_reward, dpi=300, height=5, width=7)
Code
# Bayesian regression analysis
# run_model() is a wrapper for brm models such that it saves the full model the first time it is run, otherwise it loads it from disk from directory `~brm`
# Fixed effects: previous_reward and group.
# Random effects: random slopes and a random intercept for both previous_reward and group by id, i.e., the effect of previous_reward and group can vary across individuals (id).

# random intercept and random slope
# brm_distance_reward <- run_model(brm(distance ~ previous_reward * group + (previous_reward + group | id), 

# random intercept                                     
brm_distance_reward <- run_model(brm(distance ~ previous_reward * group + (1|id),
                                     data=subset(dat, trial > 0 & round %in% 2:9 ),
                                     cores=4,
                                     chains=4,
                                     backend = "cmdstanr",
                                     seed = 0815,
                                     iter = 5000,
                                     warmup=1000,
                                     control = list(adapt_delta = 0.99, max_treedepth = 15)),
                                 #prior = prior(normal(0,10), class = "b")),
                                 modelName = 'brm_distance_reward')
#tab_model(brm_distance_reward, bpe="mean", title = "Bayesian regression results: Search distance as function of reward on previous step.") 
#bayes_R2(brm_distance_reward) 

# generate predictions manually (otherwise it is difficult to overlay the mean empirical values via geom_point)
# prevReward <-  seq(-3,55) / 50 # normalized reward

prevReward <-  seq(round(min(dat$previous_reward, na.rm=T),1),round(max(dat$previous_reward, na.rm=T),1), 0.1) # normalized reward

group  <-  levels(dat$group)
newdat <-  expand.grid(previous_reward=prevReward, group=group)

# predict distance based on previous reward
preds <-  fitted(brm_distance_reward, re_formula=NA, newdata=newdat, probs=c(.025, .975))
predsDF <-  data.frame(previous_reward=rep(prevReward, 3),
                       group=rep(levels(dat$group), each=length(prevReward)),
                       distance=preds[,1],
                       lower=preds[,3],
                       upper=preds[,4])

# average Manhattan distance between two random tiles on the 8x8 grid
grid  <-  expand.grid(x1=0:7, x2=0:7, y1=0:7, y2=0:7)
grid$distance <-  NA

for(i in 1:dim(grid)[1]){
  grid$distance[i] <- dist(rbind(c(grid$x1[i], grid$x2[i]), c(grid$y1[i], grid$y2[i])), method = "manhattan")
}

meanDist  <-  mean(grid$distance)
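# Analytic sanity check (sketch, not part of the original pipeline): for X, Y
# independent and uniform on 0..7, E|X - Y| = sum over all pairs |i - j| / 64
# = 2.625; the Manhattan distance sums two such coordinate differences, so its
# expectation is 2 * 2.625 = 5.25, which should match meanDist above
e_manhattan <- 2 * sum(outer(0:7, 0:7, function(a, b) abs(a - b))) / 64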

# plot predictions
p_regression_distance_reward <- 
  ggplot() +
  stat_summary(dat, mapping=aes(x=previous_reward, y=distance, color=group, fill=group), fun=mean, geom='point', alpha=0.7, size=1, na.rm=T)+
  geom_line(predsDF, mapping=aes(x=previous_reward, y=distance, color=group), linewidth=1) +
  geom_ribbon(predsDF, mapping=aes(x=previous_reward, y=distance, ymin=lower, ymax=upper, fill=group), alpha=.3) +
  #geom_hline(yintercept=meanDist, linetype='dashed', color='red') + # mean distance
  # xlab('Normalized Previous Reward')+
  coord_cartesian(ylim= c(0,5), xlim= c(-0.01,1)) +
  scale_x_continuous(
    name = "Previous normalized reward",
    breaks = seq(0, 1, 0.1),
    labels = sprintf("%.1f", seq(0, 1, 0.1))) +
  ylab('Distance to next chosen option')+
  scale_fill_manual(values=groupcolors) +
  scale_color_manual(values=groupcolors) +
  labs(
  title = "Search distance as function of previous reward",
  #subtitle = "(Bayesian hierarchical regression)"
  ) +
  theme_classic() +
  theme(legend.position = "inside", 
        legend.position.inside = c(0.85, 0.9),   
        legend.justification = c(1, 1),
        legend.title = element_blank(),
        legend.text = element_text(colour = "black", size = 18),
        plot.title = element_text(colour = "black", size = 24),
        plot.subtitle = element_text(colour = "black", size = 18),
        axis.text = element_text(colour = "black", size = 18),
        axis.title = element_text(colour = "black", size = 18)
        )        

p_regression_distance_reward
ggsave("plots/regression_distance_reward_brms.png", p_regression_distance_reward, dpi=300, height=5, width=7)
# ggsave("plots/regression_distance_reward_brms.pdf", p_regression_distance_reward, height=5, width=7)

# mean values by previous reward and group

# df_summary <- dat %>%
#   group_by(previous_reward, group) %>%
#   summarise(
#     mean_distance = mean(distance, na.rm = TRUE),
#     .groups = "drop"
#   )

# Plot for Computational Psychiatry Conference (Tübingen, July 2025)
# ggplot() +
#   stat_summary(dat, mapping=aes(x=previous_reward, y=distance, color=group, fill=group), fun=mean, geom='point', alpha=0.7, size=1, na.rm=T)+
#   geom_line(predsDF, mapping=aes(x=previous_reward, y=distance, color=group), linewidth=1) +
#   geom_ribbon(predsDF, mapping=aes(x=previous_reward, y=distance, ymin=lower, ymax=upper, fill=group), alpha=.3) +
#   #geom_hline(yintercept=meanDist, linetype='dashed', color='red') + # mean distance
#   scale_x_continuous('Previous Reward', breaks = seq(0,50,10), labels = c(0,10,20,30,40,50)) +
#   scale_y_continuous('Distance to next choice', breaks = seq(0,7,1), labels = c(0,1,2,3,4,5,6,7))+ 
#   ylab('Distance to next choice') +
#   coord_cartesian(ylim= c(0,5)) +
#   ylab('Distance to next choice')+
#   scale_fill_manual(values=groupcolors) +
#   scale_color_manual(values=groupcolors) +
#   labs(
#   title = "Search Distance ~ Previous Reward",
#   subtitle = "(Bayesian hierarchical regression)") +
#   theme_classic() +
#   theme(legend.position = "inside", 
#         legend.position.inside = c(0.85, 0.9),   
#         legend.justification = c(1, 1),
#         legend.title = element_blank(),
#         legend.text = element_text(colour = "black", size = 18),
#         plot.title = element_text(colour = "black", size = 22),
#         plot.subtitle = element_text(colour = "black", size = 18),
#         axis.text = element_text(colour = "black", size = 18),
#         axis.title = element_text(colour = "black", size = 18)
#         )          
# 
# 
# ggsave("plots/regression_distance_reward_brms_CPP.png", dpi=300, height=5, width=6)
# ggsave("plots/regression_distance_reward_brms_CPP.pdf", height=5, width=6)
Figure 9: Bayesian hierarchical regression analysis with search distance as function of previous reward. Dots are the empirical mean distances for each reward value, aggregated over participants, trials, and rounds.

4.7 Bonus round judgments

In the bonus round, participants made 15 search decisions and then predicted the rewards for 5 randomly chosen, previously unobserved tiles. Subsequently, they chose one of the five tiles and continued the round until the search horizon of 25 clicks was met.

4.7.1 Data structure

Data frame dat_bonus contains the following variables:

  • id: participant id
  • bonus_env_number: internal id of the bonus round environment
  • bonus_environment: environment condition, Smooth (high spatial correlation) or Rough (low spatial correlation)
  • x and y: coordinates of the random tiles on the grid for which participants were asked to provide reward estimates
  • givenValue: participant reward judgment (scale 0-50)
  • howSecure: participant confidence in the given reward judgment (scale 0-10)
  • chosen_x and chosen_y: coordinates of the tile chosen after making reward and confidence judgments for the 5 random tiles
  • true_z: ground truth, i.e., the true expected reward of a tile
  • error: absolute deviation between a participant's reward estimate (givenValue) and the ground truth (true_z)
  • chosen: whether the option was chosen or not (participants chose one of the five options after estimating their value and their confidence in their reward prediction)
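As a minimal illustration of the error variable (using the first two rows of Table 3 as toy values):

```r
# Toy illustration (values from the first two rows of the bonus round data):
# error is the absolute deviation between judgment and ground truth
toy <- data.frame(givenValue = c(0.40, 0.52), true_z = c(0.33, 0.32))
toy$error <- abs(toy$givenValue - toy$true_z)
round(toy$error, 2)  # 0.07 0.20, matching the error column in Table 3
```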
Table 3: Bonus round data.
id bonus_env_number bonus_environment x y givenValue howSecure chosen_x chosen_y true_z chosen error group
111 38 Rough 5 6 0.40 5 7 3 0.33 not chosen 0.07 Control
111 38 Rough 2 7 0.52 4 7 3 0.32 not chosen 0.20 Control
111 38 Rough 7 3 0.32 5 7 3 0.77 chosen 0.45 Control
111 38 Rough 0 7 0.56 3 7 3 0.48 not chosen 0.08 Control
111 38 Rough 7 6 0.60 5 7 3 0.68 not chosen 0.08 Control
115 39 Rough 0 0 0.38 4 3 1 0.52 not chosen 0.14 Control

4.7.2 Prediction error

Figure 10 shows the mean absolute error between participants’ estimates and the true underlying expected rewards. All three groups performed better than a random baseline:

  • Control: \(t(32)=-15.1\), \(p<.001\), \(d=2.6\), \(BF>100\)
  • PD on: \(t(28)=-10.0\), \(p<.001\), \(d=1.9\), \(BF>100\)
  • PD off: \(t(25)=-8.4\), \(p<.001\), \(d=1.7\), \(BF>100\)
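These group-level comparisons are one-sample t-tests of per-participant mean errors against the random baseline (with effect sizes and Bayes factors presumably from the helpers in statisticalTests.R and the BayesFactor package). A minimal sketch with simulated data, not the study data:

```r
# Sketch with simulated data (hypothetical values, not the study data):
# one-sample t-test of mean prediction errors against a chance baseline,
# with Cohen's d computed as (mean - mu) / sd
set.seed(0815)
m_error  <- rnorm(33, mean = 0.12, sd = 0.04)  # hypothetical per-subject errors
baseline <- 0.25                               # hypothetical random baseline
tt <- t.test(m_error, mu = baseline)
cohens_d <- (mean(m_error) - baseline) / sd(m_error)
```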

Controls had lower prediction error than PD off patients; no other group differences were observed.

  • Control vs. PD on: \(t(60)=-1.2\), \(p=.221\), \(d=0.3\), \(BF=.49\)
  • Control vs. PD off: \(t(57)=-2.6\), \(p=.012\), \(d=0.7\), \(BF=4.1\)
  • PD on vs. PD off: \(t(53)=-1.2\), \(p=.249\), \(d=0.3\), \(BF=.48\)
Figure 10: Prediction error of bonus round judgments. The red dotted line indicates a random baseline.

4.7.3 Prediction error and confidence

Aggregated across all judgments, there was no overall relationship between prediction error and confidence, and the patterns within groups were inconsistent. The Control group showed a weak negative relation (i.e., lower confidence was associated with higher prediction error), the PD on group a weak positive trend (i.e., higher confidence was associated with higher prediction error), and the PD off group showed no association. None of these patterns were strong or reliable.

  • Overall: \(r=-.03\), \(p=.589\), \(BF=.13\)
  • Control: \(r=-.15\), \(p=.052\), \(BF=1.1\)
  • PD on: \(r=.13\), \(p=.107\), \(BF=.67\)
  • PD off: \(r=-.03\), \(p=.710\), \(BF=.22\)
Code
# Across all judgments and participants, there was no systematic relation between confidence and prediction error:
# corTestPretty(dat_bonus$error, dat_bonus$howSecure, method = "pearson") 
# cor.test(dat_bonus$error, dat_bonus$howSecure, method = "pearson") 
# correlationBF(dat_bonus$error, dat_bonus$howSecure, method = "pearson")

A Bayesian regression with prediction error as the dependent variable, with confidence, group, and their interaction as population-level (“fixed”) effects and a random intercept for participants, yielded the same conclusions: in the Control group, confidence and prediction error were weakly negatively related (i.e., lower confidence was associated with higher error), whereas in the two Parkinson groups confidence ratings and prediction error were unrelated.

Code
brm_bonus_confidence_error_by_group <- run_model(brm(error ~ howSecure * group + (1|id), 
                                                     data=dat_bonus, 
                                                     cores=4,  
                                                     chains=4,
                                                     backend = "cmdstanr",
                                                     control = list(adapt_delta = 0.99),
                                                     iter = 5000,
                                                     warmup=1000, 
                                                     seed = 0815), 
                                                 modelName = 'brm_bonus_confidence_error_by_group')
#tab_model(brm_bonus_confidence_error_by_group, bpe = "mean", title = "Bayesian regression results: Prediction error and confidence") 
#bayes_R2(brm_bonus_confidence_error_by_group)

4.7.4 Analysis of selected tiles

To analyze selected and not-selected options, we first averaged the predicted reward and confidence of the not-chosen tiles within subjects, and then compared chosen and not chosen options. Selected tiles tended to have higher predicted rewards. Participants were not more confident in selected options, and selected tiles did not have a higher true reward than not selected tiles.

Code
# average not-chosen tiles within subjects first
df_chosen_overall <- dat_bonus %>% 
  group_by(id, chosen) %>% 
  summarise(m_givenValue = mean(givenValue),
            m_howSecure = mean(howSecure),
            m_true_z = mean(true_z))

df_chosen_group <- dat_bonus %>% 
  group_by(id, group, chosen) %>% 
  summarise(m_givenValue = mean(givenValue),
            m_howSecure = mean(howSecure),
            m_true_z = mean(true_z))

# wide format for paired t-tests of chosen vs. not chosen tiles
df_chosen_group_wide <- df_chosen_group  %>% 
  mutate(chosen = recode(chosen,
                         "not chosen" = "not_chosen")) %>% 
  select(id, chosen, m_givenValue)  %>% 
  pivot_wider(names_from = chosen, values_from = m_givenValue) 
  

# df_chosen_group %>%
#   group_by(group,chosen) %>%
#   summarise(predicted_reward = mean(m_givenValue),
#             confidence = mean(m_howSecure),
#             true_reward = mean(m_true_z)) %>%
#   kable(format = "html", escape = F, digits = 2) %>%
#   kable_styling("striped", full_width = F)

# plot reward predictions: chosen vs not-chosen options
p_bonusround_prediction_chosen <-
  ggplot(df_chosen_group, aes(x = chosen, y = m_givenValue, color = group, fill = group, shape = group)) +
  facet_wrap(~group) +
  geom_boxplot(alpha = 0.2, outlier.shape = NA, width = 0.4) +
  geom_jitter(position = position_jitter(width = 0.2), size = 1.5, alpha = 0.8) +
  stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 3) +
  scale_color_manual(values = groupcolors) +
  scale_fill_manual(values = groupcolors) +
  labs(title = "Predicted reward of chosen vs. not chosen options", y = "Predicted reward", x = "") +
  theme_classic() +
  theme(
    legend.position = 'none',
    legend.title = element_blank(),
      plot.title = element_text(size=24),
      axis.text = element_text(size=18),
      axis.title = element_text(size=18),
      strip.background = element_blank(),  
      strip.text = element_text(color = "black", size=18)
  )

p_bonusround_prediction_chosen
ggsave("plots/bonusround_chosen_not_chosen_options_predicted_reward.png", p_bonusround_prediction_chosen, dpi=300, width=9, height = 4)
Figure 11: Predicted reward of chosen vs. not chosen options in bonus round.

The table below shows the predicted rewards for chosen vs. not chosen options. In the Control group, selected tiles were associated with higher predicted rewards than those not selected. In both Parkinson’s groups, no such difference was observed.

  • Control: \(t(32)=2.9\), \(p=.007\), \(d=0.6\), \(BF=6.1\)
  • PD on: \(t(28)=1.8\), \(p=.089\), \(d=0.3\), \(BF=.77\)
  • PD off: \(t(25)=-0.2\), \(p=.868\), \(d=0.0\), \(BF=.21\)
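These are paired t-tests on the per-participant means; the wide-format data frame df_chosen_group_wide constructed above provides the paired columns. A minimal sketch with simulated data, not the study data:

```r
# Sketch with simulated data (hypothetical values): paired t-test of predicted
# rewards for chosen vs. not-chosen tiles, mirroring the Control comparison
set.seed(0815)
chosen     <- rnorm(33, mean = 0.49, sd = 0.15)           # hypothetical judgments
not_chosen <- chosen - rnorm(33, mean = 0.06, sd = 0.10)  # hypothetical effect
tt <- t.test(chosen, not_chosen, paired = TRUE)
tt$parameter  # df = n - 1 = 32
```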
Code
df_chosen_group %>%
  group_by(group,chosen) %>%
  summarise(predicted_reward = mean(m_givenValue),
            confidence = mean(m_howSecure),
            true_reward = mean(m_true_z)) %>%
  kable(format = "html", escape = F, digits = 2) %>%
  kable_styling("striped", full_width = F)
Predicted rewards of chosen vs. not chosen options in bonus round.
group chosen predicted_reward confidence true_reward
Control chosen 0.49 4.73 0.44
Control not chosen 0.43 4.51 0.44
PD on chosen 0.50 4.52 0.48
PD on not chosen 0.46 4.39 0.44
PD off chosen 0.38 3.54 0.42
PD off not chosen 0.39 3.58 0.44
Code
# chosen vs not chosen: confidence
ggboxplot(df_chosen_group, x = "chosen", y = "m_howSecure",
          color = "group", palette =groupcolors, fill = "group", alpha = 0.2,
          add = "jitter", shape = "group", title = "Confidence of chosen vs. not chosen options",
          facet.by = "group") +
  ylab("Confidence in reward prediction") +
  xlab("") +
  stat_compare_means(comparisons =  list( c("chosen", "not chosen") ), paired = T, method = "t.test", label = "p.format") +
  stat_summary(fun = mean, geom="point", shape = 23, fill = "white", size=3) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        legend.title = element_blank()
  )

# ggsave("plots/bonusround_chosen_not_chosen_options_confidence.png", dpi=300, width=7, height = 5)
Figure 12: Confidence in reward prediction of chosen vs. not chosen options in bonus round.

5 Appendix

5.1 Big overview figures

5.1.1 Task and computational model

Code
# get task illustration
img_task <- image_read("img/task.png")  
gimg_task <- rasterGrob(as.raster(img_task), interpolate = TRUE)

# add title
task <- ggdraw() +
  draw_label("Experiment and design", size = 24, x = 0.03, y = 1, hjust = 0, vjust = 1.5) +
  draw_grob(gimg_task, y = -0.02)

# get GP-UCB model illustration
img_model <- image_read("img/GP-UCB model.png")  
gimg_model <- rasterGrob(as.raster(img_model), interpolate = TRUE)

# add title
model <- ggdraw() +
  draw_label("Computational GP-UCB model", size = 24, x = 0.03, y = 1, hjust = 0, vjust = 1.5) +
  draw_grob(gimg_model, y = 0.05)

cowplot::plot_grid(
  task,
  model,
  ncol = 1,
  labels = c("auto"),      
  label_size = 22,
  label_y = c(1, 1),  # vertical position of the cowplot labels (A, B)
  label_x = c(0, 0)   # horizontal position (left)
)

ggsave("plots/task_model.png", dpi=300, height=10, width=15)
ggsave("plots/task_model.pdf", height=10, width=15, limitsize = FALSE )
Figure 13: Task and computational model

5.1.2 Behavioral results

Code
cowplot::plot_grid(
   p_performance, p_learning_curves,
   p_explore_exploit_choices, p_explore_exploit_choices_over_time,
   p_distance_consecutive, p_regression_distance_reward, #p_distance_consecutive p_types_choices_overall
   p_bonusround_prediction_error, p_bonusround_prediction_chosen,
   labels = c("auto"),
   label_size = 22,
   label_y = 1.02, 
   nrow = 4,
   rel_widths = c(0.45, 0.55)
)

ggsave("plots/behavioral_results.png", dpi=300, height=19, width=18)
ggsave("plots/behavioral_results.pdf", height=20, width=18)


# VERSION WITH TASK IN TOP ROW
# get task illustration
# img_task <- image_read("img/task.png")  
# gimg_task <- rasterGrob(as.raster(img_task), interpolate = TRUE)
# 
# # add title
# task <- ggdraw() +
#   draw_label("Experiment and design", size = 24, x = 0.05, y = 1, hjust = 0, vjust = 1.5) +
#   draw_grob(gimg_task)
# 
# # results plots
# results_grid <- plot_grid(
#   p_performance, p_learning_curves,
#   p_explore_exploit_choices, p_explore_exploit_choices_over_time,
#   p_distance_consecutive, p_regression_distance_reward,
#   labels = c("b", "c", "d", "e", "f", "g"),  
#   label_size = 22,
#   ncol = 2,
#   rel_widths = c(0.45, 0.55)
#   #align = "hv"
# )
# 
# cowplot::plot_grid(
#   task,
#   results_grid,
#   ncol = 1,
#   rel_heights = c(0.4, 1),  
#   labels = c("a", ""),      
#   label_size = 22
# )
# 
# ggsave("plots/behavioral_results.png", dpi=300, height=20, width=15)
# ggsave("plots/behavioral_results.pdf", height=20, width=15)
Figure 14: Behavioral results.
Code
### Bonus round
# cowplot::plot_grid(
#   p_bonusround_prediction_error,
#   p_bonusround_prediction_chosen,
#   nrow = 1,
#   rel_widths = c(0.45, 0.55),
#   labels = c("auto"),      
#   label_size = 22
# )
# 
# ggsave("plots/bonus_round.png", dpi=300, height=4, width=15)
# ggsave("plots/bonus_round.pdf", height=4, width=15)

5.2 Distribution of BDI, MMSE, and Hoehn-Yahr in each group

Code
# dotplot BDI
# p_dotplot_BDI <- 
ggplot(dat_sample, aes(x = BDI, fill = group)) +
  facet_wrap(~group) +
  geom_dotplot(binwidth = 1, dotsize = 1) +
  scale_fill_manual(values = groupcolors) +
  scale_x_continuous("BDI score") + 
  scale_y_continuous(NULL, breaks = NULL) + 
  coord_fixed(ratio = 15) +
  theme_classic() +
  theme(
    legend.title = element_blank(),
    legend.position = 'none',
    strip.text = element_text(size=14),
    legend.text =  element_text(colour="black"),
    text = element_text(colour = "black"),
    strip.background =element_blank(),
    axis.text.x = element_text(colour="black"),
    axis.text.y = element_text(colour="black"),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank())

Code
# p_dotplot_MMSE <- 
ggplot(dat_sample, aes(x = MMSE, fill = group)) +
  facet_wrap(~group) +
  geom_dotplot(binwidth = 1, dotsize = 1) +
  scale_fill_manual(values = groupcolors) +
  scale_x_continuous("MMSE score") + 
  scale_y_continuous(NULL, breaks = NULL) + 
  coord_fixed(ratio = 15) +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5, size = 10),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        text = element_text(colour = "black"),
        strip.background =element_blank(),
        axis.text.x = element_text(colour="black"),
        axis.text.y = element_text(colour="black"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

Code
p_dotplot_HY <- ggplot(filter(dat_sample, group != "Control"), aes(x = hoehn_yahr, fill = group)) +
  facet_wrap(~group) +
  geom_dotplot(binwidth = 1, dotsize = 1) +
  scale_fill_manual(values = groupcolors) +
  scale_x_continuous("Hoehn-Yahr score") + 
  scale_y_continuous(NULL, breaks = NULL) + 
  coord_fixed(ratio = 15) +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5, size = 10),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        text = element_text(colour = "black"),
        strip.background =element_blank(),
        axis.text.x = element_text(colour="black"),
        axis.text.y = element_text(colour="black"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

5.3 Performance as function of BDI, MMSE, and Hoehn-Yahr

5.3.1 Performance as function of depression score (BDI-II)

The plots show performance as function of depression score (BDI-II), separately for each group. Opposing trends were found in the different groups: for patients with polyneuropathy (Control), there was a negative relation such that patients with higher depression scores obtained lower rewards. For the two Parkinson groups, the relation was positive, such that patients reporting more severe symptoms obtained higher rewards.

Code
df_correlation_groups_BDI <- df_mean_reward_subject %>%
  group_by(group) %>%
  summarise(correlation = cor(BDI, mean_reward, use = "complete.obs"))


ggplot(df_mean_reward_subject, aes(x = BDI, y = mean_reward)) +
  facet_wrap(~group,  scales = "free_x") +
  #geom_hline(data=filter(df_random_performance, environment=="Rough"), linetype="dotted", aes(yintercept=z_learn_envs)) + 
  #geom_hline(data=filter(df_random_performance, environment=="Smooth"), linetype="dotted",aes(yintercept=z_learn_envs)) +
  geom_smooth(colour = "black", linetype = "dashed", linewidth = 0.5, method="lm", se=T, alpha = 0.2) +
  #geom_jitter(aes(fill = BDI), shape = 21, alpha = jit_alpha, colour = "black") +
  geom_point(aes(fill = BDI), shape = 21, alpha = jit_alpha, colour = "black") +
  scale_fill_gradient(low="blue",high="red") +
  scale_y_continuous("Mean normalized reward") + 
  scale_x_continuous("BDI score") +
  coord_cartesian(xlim = c(0,15), ylim = c(.2,.9)) +
  ggtitle("Performance as function of depression score") +
  geom_text(data = df_correlation_groups_BDI, aes(x = 3.5, y = 0.25, label = paste0("italic(r) == ", round(correlation, 2))), parse = TRUE, inherit.aes = FALSE, size = 5) +
  theme_classic() +
  theme(aspect.ratio = 1,
        plot.title = element_text(size =16),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        strip.background = element_blank(),
        strip.text = element_text(size=14),
        axis.title = element_text(colour = "black", size = 14),
        axis.text = element_text(colour = "black", size = 14),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

# ggsave("plots/performance_groups_BDI.png", dpi=300, height = 3, width = 7)
Figure 15: Performance as function of depression score (BDI-II).

5.3.2 Performance as function of Mini-Mental State Examination (MMSE)

Higher values indicate better cognitive functioning.

Code
df_correlation_groups_MMSE <- df_mean_reward_subject %>%
  group_by(group) %>%
  summarise(correlation = cor(MMSE, mean_reward, use = "complete.obs"))

ggplot(df_mean_reward_subject, aes(x = MMSE, y = mean_reward)) +
  facet_wrap(~group,  scales = "free_x") +
  geom_smooth(colour = "black", linetype = "dashed", linewidth = 0.5, method="lm", se=T, alpha = 0.2) +
  geom_point(aes(fill = MMSE), shape = 21, alpha = jit_alpha, colour = "black") +
  scale_fill_gradient(low="blue",high="red") +
  scale_y_continuous("Mean normalized reward") + 
  scale_x_continuous("MMSE score", breaks = c(27,28,29,30)) +
  coord_cartesian(xlim = c(27,30), ylim = c(0.2,0.9)) +
  ggtitle("Performance as function of mental status (MMSE)") +
  geom_text(data = df_correlation_groups_MMSE, aes(x = 28, y = 0.25, label = paste0("italic(r) == ", round(correlation, 2))), parse = TRUE, inherit.aes = FALSE, size = 5) +
  theme_classic() +
 theme(aspect.ratio = 1,
        plot.title = element_text(size =16),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        strip.background = element_blank(),
        strip.text = element_text(size=14),
        axis.title = element_text(colour = "black", size = 14),
        axis.text = element_text(colour = "black", size = 14),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

# ggsave("plots/performance_groups_MMSE.png", dpi=300, height = 3, width = 7)
Figure 16: Performance as function of Mini-Mental State Examination (MMSE).

5.3.3 Performance as function of Hoehn-Yahr (Parkinson patients only)

The Hoehn-Yahr scale provides basic information about the severity of motor impairments in Parkinson’s disease, with higher scores indicating greater severity.

Code
# separately for each group
df_correlation_groups_hoehn_yahr <- df_mean_reward_subject %>%  
  filter(group != "Control") %>% 
  group_by(group) %>%
  summarise(correlation = cor(hoehn_yahr, mean_reward, use = "complete.obs"))

ggplot(filter(df_mean_reward_subject, group != "Control"), aes(x = hoehn_yahr, y = mean_reward)) +
  facet_wrap(~group,  scales = "free_x") +
  #geom_hline(data=filter(df_random_performance, environment=="Rough"), linetype="dotted", aes(yintercept=z_learn_envs)) + 
  #geom_hline(data=filter(df_random_performance, environment=="Smooth"), linetype="dotted",aes(yintercept=z_learn_envs)) +
  geom_smooth(colour = "black", linetype = "dashed", linewidth = 0.5, method="lm", se=T, alpha = 0.2) +
  geom_jitter(aes(fill = MMSE), shape = 21, alpha = jit_alpha, width = 0.0001, colour = "black") +
  #geom_point(aes(fill = hoehn_yahr), shape = 21, alpha = jit_alpha, colour = "black") +
  scale_fill_gradient(low="blue",high="red") +
  scale_y_continuous("Mean normalized reward") + 
  scale_x_continuous("Hoehn-Yahr score", breaks = c(1,2,3)) +
  coord_cartesian(xlim = c(1,3), ylim = c(0.2,0.9)) +
  ggtitle("Performance as function of Hoehn-Yahr score") +
  geom_text(data = df_correlation_groups_hoehn_yahr, aes(x = 1.3, y = 0.25, label = paste0("italic(r) == ", round(correlation, 2))), parse = TRUE, inherit.aes = FALSE, size = 5) +
  theme_classic() +
  theme(aspect.ratio = 1,
        plot.title = element_text(size =16),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        strip.background = element_blank(),
        strip.text = element_text(size=14),
        axis.title = element_text(colour = "black", size = 14),
        axis.text = element_text(colour = "black", size = 14),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

Performance as function of Hoehn-Yahr scale (Parkinson patients only).
Code
# ggsave("plots/performance_groups_HY.png", dpi=300, height = 3, width = 5)

5.3.4 Reading list

  • Frank et al. (2004)
    Cognitive reinforcement learning in parkinsonism
    Demonstrated that Parkinson’s patients off medication learn poorly from positive feedback, implicating dopamine in reward learning and adaptive action selection.
    Frank et al. (2004): Cognitive reinforcement learning in parkinsonism. Science, 306, 1940–1943

  • Daw et al. (2006)
    Cortical substrates for exploratory decisions in humans
    Showed that exploratory decisions activate frontopolar cortex and are modulated by uncertainty, supporting a neural mechanism for explore–exploit arbitration.
    Daw et al. (2006): Cortical substrates for exploratory decisions in humans. Nature, 441, 876–879

  • Cohen et al. (2007)
    How the brain manages the exploration–exploitation tradeoff
    Proposed a theoretical framework linking locus coeruleus–norepinephrine function to exploration, via regulation of cortical neural gain.
    Cohen et al. (2007): How the brain manages the exploration–exploitation dilemma. Phil. Trans. R. Soc. B, 362, 933–942

  • Frank et al. (2009)
    Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation
    Found that genetic variation in dopamine pathways predicts differences in both uncertainty-directed and random exploration.
    Frank et al. (2009): Dopaminergic genes predict individual differences in exploration and exploitation. Nat Neurosci, 12, 1062–1068

  • Kayser et al. (2015)
    Dopamine, locus of control, and the exploration–exploitation tradeoff
    Showed that dopamine synthesis capacity interacts with locus of control to shape exploration strategies.
    Kayser et al. (2015): Dopamine, locus of control, and the exploration–exploitation tradeoff. Neuropsychopharmacology, 40, 454–462

  • Addicott et al. (2017)
    A primer on the explore/exploit trade-off for psychiatry
    Reviewed relevance of explore/exploit mechanisms to psychiatric disorders and their utility in clinical research.
    Addicott et al. (2017): A primer on the explore/exploit trade-off for psychiatry. Neuropsychopharmacology, 42, 1931–1939

  • Gershman & Tzovaras (2018)
    Dopaminergic genes are associated with both directed and random exploration
    Found that polymorphisms in dopaminergic genes relate selectively to directed exploration, not random variability.
    Gershman & Tzovaras (2018): Dopaminergic genes and exploration. Neuropsychologia, 120, 97–104

  • Cinotti et al. (2019)
    Dopamine blockade impairs the exploration–exploitation trade-off in rats
    Provided causal evidence that dopamine D2 receptor antagonism reduces exploratory choices in a multi-armed bandit task.
    Cinotti et al. (2019): Dopamine blockade impairs the explore–exploit trade-off. Sci Rep, 9, 6770

  • Chakroun et al. (2020)
    Dopaminergic modulation of the exploration/exploitation trade-off in human decision-making
    Showed that L-DOPA boosts uncertainty-directed exploration but not random exploration in humans.
    Chakroun et al. (2020): Dopaminergic modulation of exploration. eLife, 9, e51260

  • Cremer et al. (2023)
    Disentangling the roles of dopamine and noradrenaline in the exploration–exploitation trade-off
    Found that dopamine primarily modulates directed exploration, while noradrenaline affects choice variability.
    Cremer et al. (2023): Disentangling roles of dopamine and noradrenaline in explore/exploit. Neuropsychopharmacology, 48, 1078–1086

  • Chen et al. (2024)
    Dopamine and norepinephrine mediate the explore/exploit trade-off in humans
    Combined pupillometry, pharmacology, and fMRI to show distinct roles of dopamine and norepinephrine in exploration and exploitation.
    Chen et al. (2024): Dopamine and norepinephrine mediate explore/exploit tradeoff. J Neurosci, 44

  • Sharp et al. (2016)
    Dopamine selectively remediates ‘model-based’ reward learning: a computational approach
    Showed that dopaminergic medication in Parkinson’s patients selectively restored model-based reward learning.
    Sharp, M. E., Foerde, K., Daw, N. D., & Shohamy, D. (2016). Dopamine selectively remediates ‘model-based’ reward learning: a computational approach. Brain, 139(2), 355–364. https://academic.oup.com/brain/article/139/2/355/1753805

References

Abbott, A. (2010). Levodopa: The story so far. Nature, 466(7310), S6–S7.
Beck, A. T., Steer, R. A., Brown, G. K., et al. (1996). Beck depression inventory.
Folstein, M. F., Folstein, S. E., & McHugh, P. R. (1975). “Mini-mental state”: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12(3), 189–198.
Giron, A. P., Ciranka, S., Schulz, E., Bos, W. van den, Ruggeri, A., Meder, B., & Wu, C. M. (2023). Developmental changes in exploration resemble stochastic optimization. Nature Human Behaviour, 7(11), 1955–1967. https://doi.org/10.1038/s41562-023-01662-1
Goetz, C. G., Poewe, W., Rascol, O., Sampaio, C., Stebbins, G. T., Counsell, C., Giladi, N., Holloway, R. G., Moore, C. G., Wenning, G. K., et al. (2004). Movement Disorder Society Task Force report on the Hoehn and Yahr staging scale: Status and recommendations. The Movement Disorder Society Task Force on Rating Scales for Parkinson’s Disease. Movement Disorders, 19(9), 1020–1028.
Hautzinger, M., Keller, F., & Kühner, C. (2006). Beck depressions-inventar (BDI-II). Harcourt Test Services.
Hoehn, M. M., & Yahr, M. D. (1967). Parkinsonism: Onset, progression, and mortality. Neurology, 17(5), 427–427.
Meder, B., Wu, C. M., Schulz, E., & Ruggeri, A. (2021). Development of directed and random exploration in children. Developmental Science, 24(4), e13095. https://doi.org/10.1111/desc.13095
Sadeghiyeh, H., Wang, S., Alberhasky, M. R., Kyllo, H. M., Shenhav, A., & Wilson, R. C. (2020). Temporal discounting correlates with directed exploration but not with random exploration. Scientific Reports, 10(1), 4020.
Schulz, E., Wu, C. M., Ruggeri, A., & Meder, B. (2019). Searching for rewards like a child means less generalization and more directed exploration. Psychological Science, 30(11), 1561–1572. https://doi.org/10.1177/0956797619863663
Tambasco, N., Romoli, M., & Calabresi, P. (2018). Levodopa in Parkinson’s disease: Current status and future developments. Current Neuropharmacology, 16(8), 1239–1252.
Wu, C. M., Schulz, E., Speekenbrink, M., Nelson, J. D., & Meder, B. (2018). Generalization guides human exploration in vast decision spaces. Nature Human Behaviour, 2, 915–924. https://doi.org/10.1038/s41562-018-0467-4